Preface



This book contains a selection of papers presented at the second annual workshop 
held under the auspices of the Esprit Working Group 21900 Types. The workshop 
took place in Irsee, Germany, from 27 to 31 March 1998 and was attended by
89 researchers. 

Of the 25 submissions, 14 were selected for publication after a regular refer- 
eeing process. The final choice was made by the editors. 

This volume is a sequel to the proceedings from the first workshop of the 
working group, which took place in Aussois, France, in December 1996. The 
proceedings appeared in vol. 1512 of the LNCS series, edited by Christine
Paulin-Mohring and Eduardo Giménez.

These workshops are, in turn, a continuation of the meetings organized in 
1993, 1994, and 1995 under the auspices of the Esprit Basic Research Action 
6453 Types for Proofs and Programs. Those proceedings were also published 
in the LNCS series, edited by Henk Barendregt and Tobias Nipkow (vol. 806,
1993), by Peter Dybjer, Bengt Nordström and Jan Smith (vol. 996, 1994) and
by Stefano Berardi and Mario Coppo (vol. 1158, 1995). The Esprit BRA 6453
was a continuation of the former Esprit Action 3245 Logical Frameworks: De- 
sign, Implementation and Experiments. The articles from the annual workshops 
organized under that Action were edited by Gerard Huet and Gordon Plotkin 
in the books Logical Frameworks and Logical Environments, both published by 
Cambridge University Press.
Introduction

The original motivation¹ for the work described in this paper was to determine
the proof theoretic strength of the type theories implemented in the proof
development systems Lego and Coq, [12,4]. These type theories combine the
impredicative type of propositions², from the calculus of constructions, [5], with
the inductive types and hierarchy of type universes of Martin-Löf's constructive
type theory, [13]. Intuitively there is an easy way to determine an upper bound
on the proof theoretic strength. This is to use the 'obvious' types-as-sets
interpretation of these type theories in a strong enough classical axiomatic set
theory. The elementary forms of type of Martin-Löf's type theory have their
familiar set theoretic interpretation, the impredicative type of propositions can
be interpreted as a two element set and the hierarchy of type universes can
be interpreted using a corresponding hierarchy of strongly inaccessible cardinal
numbers. The assumption of the existence of these cardinal numbers goes beyond
the proof theoretic strength of ZFC. But Martin-Löf's type theory, even
with its W types and its hierarchy of universes, is not fully impredicative and
has proof theoretic strength way below that of second order arithmetic. So it
is not clear that the strongly inaccessible cardinals used in our upper bound
are really needed. Of course the impredicative type of propositions does give a
fully impredicative type theory, which certainly pushes up the proof theoretic
strength to a set theory³, Z^-, whose strength is well above that of second
order arithmetic. The hierarchy of type universes will clearly lead to some further
strengthening. But is it necessary to go beyond ZFC to get an upper bound?

* This paper was written while on sabbatical leave from Manchester University. I 
am grateful to my two departments for making this possible. I am also grateful to 
Nijmegen University Computer Science Department for supporting my visit there. 
Some of the ideas for this paper were developed during that visit. 

¹ The same motivation may be found in [15]. More or less the same tools are used
there as here; i.e. the types-as-sets and sets-as-trees interpretations. But that paper
focuses on slightly different results to the ones obtained here.

² Here we will ignore the use of any rules for putting types other than Π types into
the impredicative type of propositions.

³ The theory Z^- is obtained from Zermelo set theory, Z, by only using formulae with
restricted quantifiers in the separation axiom scheme.
Surprisingly perhaps, the types-as-sets interpretation⁴ has hardly been studied
systematically⁵. So it is the main aim of this paper to start such a systematic
study. In section 2 we first present some of the details of the TS interpretation
of a type theory MLW^ext that is a reformulation of Martin-Löf's extensional type
theory with W types but no type universes. This interpretation is carried out
in the standard axiomatic set theory ZFC and so gives a proof theoretic reduction
of MLW^ext to ZFC. Of course this result is much too crude and we go on in
section 2 to describe two approaches to getting a better result.

The first approach is to make the type theory classical by adding the natural
formulation of the law of excluded middle. It turns out that to carry through
the interpretation we need to strengthen the set theory by adding a global form
of the axiom of choice and we get a proof theoretic reduction of MLW^ext + EM
to ZFGC. Fortunately it is known that the strengthened set theory is not proof
theoretically stronger, so that we do get a reduction of MLW^ext + EM to ZFC.

Section 2 ends with the second approach, which is to replace the classical set
theory by a constructive set theory, CZF^+, that is based on intuitionistic logic
rather than classical logic. So we get a reduction of MLW^ext to CZF^+.

In section 3 we extend the results of section 2 by adding first a type universe
reflecting the forms of type of MLW^ext and then an infinite cumulative hierarchy
of such type universes. To extend the TS interpretation to the resulting type
theories we use, in classical set theory, strongly inaccessible cardinal numbers
for the type theories with EM, and in constructive set theory, inaccessible sets as
introduced in [11]. Finally in section 3, we formulate type theories having rules
for the impredicative type of propositions of the calculus of constructions and
formulate corresponding axioms of constructive set theory and again describe
how each of these type theories has a TS interpretation into a corresponding set
theory.

In section 4 we briefly describe how the sets-as-trees interpretation⁶ of
CZF into the type theory MLWU, first presented in [1] and then developed further
in [2,3,10,11], extends to the other set theories, giving reductions to the
corresponding type theories with an extra type universe. Fortunately each type
theory with an infinite hierarchy of type universes is proof theoretically as strong
as the type theory with a type universe added on top, so that we end up with
results stating that to each of the type theories we consider that have an infinite
hierarchy of type universes there is a corresponding set theory of the same
proof theoretic strength. In particular the type theory MLWPU_{<ω}, that is our
approximation to the type theories implemented in Lego and Coq, has the same
proof theoretic strength as the set theory CZF^+pu_{<ω}. This last result does not
solve the original problem motivating our work as the set theory is unfamiliar.
Nevertheless I think that it does give a new handle on the problem. The new
set theory is an interesting one and I plan to present some results about it on a
future occasion.

⁴ Here abbreviated TS interpretation.

⁵ But see [6,7,8,15].

⁶ Here abbreviated ST interpretation.
In section 1 we set up our particular approach to the syntax of our type
theories and the TS interpretation of them. We have tried to make this as simple
as possible. We have preferred to focus on extensional Martin-Löf type theories
having extensional equality types Eq(A, a_1, a_2) for the TS interpretation, as the
rules for these types are easily seen to be sound. We have also added equality
types EQ(A_1, A_2) for the same reason. For the reverse ST interpretation these
equality types are not needed, but nor are any intensional equality types needed,
so we can simply drop the extensionality rules.

In this paper we claim various results about proof theoretic reductions between
formal systems. When we have reductions both ways we write that the
formal systems have the same proof theoretic strength. What do we mean by
such claims? Here I will only be concerned with the relatively weak notion of
reduction given by a finitistic relative consistency proof. It is standard practice
to take the quantifier free theory PRA of primitive recursive arithmetic to codify
finitistic mathematics. A more convenient, but essentially equivalent theory is
the formal system Σ^0_1-IA. This is the subsystem of Formal Arithmetic, PA, in
which the induction scheme is restricted to Σ^0_1 formulae. For each type theory or
set theory, E, that we will be interested in there will be a standard Π^0_1 sentence
Con(E) of Formal Arithmetic, that naturally expresses the formal consistency
of E. For formal systems, such as the set theories, that use a first order language,
the system is understood to be consistent if there is no proof of a contradiction
A ∧ ¬A. In the case of the type theories considered here, where there is an empty
type, 0, we will call the type theory consistent if there is no derivation in the
type theory of a judgement of the form a : 0 for some a. Given two formal
systems E_1 and E_2, E_1 is defined to be proof theoretically reducible to E_2 if
the sentence Con(E_2) → Con(E_1) can be proved in Σ^0_1-IA. In this paper we
generally obtain such a reduction via an explicit interpretation that allows any
derivation of a theorem of E_1 to determine a corresponding derivation in E_2,
in such a way that Con(E_2) → Con(E_1) easily follows. The interpretations can
probably be used to give proof theoretic reductions in a stronger sense than
used here.⁷ But we leave such strengthenings for others. We say that two formal
systems are of the same proof theoretic strength if each is proof theoretically
reducible to the other.
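
As a small worked consequence of this definition (a sketch added for illustration, not a result stated in the paper), reducibility is transitive, which is what lets "same proof theoretic strength" behave like an equivalence relation. Here T_0 is only an abbreviation, introduced for this illustration, for the base arithmetic theory in which the implications are proved; E_1, E_2, E_3 are arbitrary formal systems of the kind considered above.

```latex
% Transitivity of proof theoretic reducibility (illustrative sketch).
% T_0 : the base arithmetic theory (abbreviation assumed for this example).
\begin{align*}
  & T_0 \vdash \mathrm{Con}(E_2) \rightarrow \mathrm{Con}(E_1)
      && \text{($E_1$ is reducible to $E_2$)} \\
  & T_0 \vdash \mathrm{Con}(E_3) \rightarrow \mathrm{Con}(E_2)
      && \text{($E_2$ is reducible to $E_3$)} \\
  \Longrightarrow\quad
  & T_0 \vdash \mathrm{Con}(E_3) \rightarrow \mathrm{Con}(E_1)
      && \text{(chaining the implications inside $T_0$)}
\end{align*}
```

The last line uses nothing beyond propositional reasoning inside the base theory.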

1 The General Form of the Syntax and Set Theoretical 
Semantics of Our Type Theories 

1.1 Syntax 

We give the general form of the syntax of the type theories we will consider. 

The Pseudoterms. These are expressions, M, given by the following abstract
syntax.

$$M ::= x \mid c_0 \mid c_1(M) \mid c_2(M, M) \mid c_3(M, M, M) \mid (Qx : M)M$$

where $x$ : VAR, $c_0 : C_0$, $c_1 : C_1$, $c_2 : C_2$, $c_3 : C_3$ and $Q$ : QUANT. Here VAR
is an infinite set of variables and the sets $C_i$, for $i = 0, 1, 2, 3$, and QUANT are
finite sets of symbols that will depend on the type theory.

⁷ See [9] and [14] for more discussion of the concepts of proof theoretic reduction.

Each $Q$ operates as a variable binder so that free occurrences of $x$ in $M'$
get bound in $(Qx : M)M'$. The notions of free and bound occurrences of variables
and the substitution operation are defined in the standard way. We write
$M[M_1, \ldots, M_n/x_1, \ldots, x_n]$ for the result of simultaneously substituting $M_i$ for $x_i$
in $M$, for $i = 1, \ldots, n$, relabelling bound variables in the usual way so as to avoid
variable clashes. For this we assume that the variables $x_1, \ldots, x_n$ are pairwise
distinct. In general we will not distinguish between pseudoterms that only differ
in a suitable relabelling of the bound variables.
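
The syntax and the substitution operation just described can be sketched in code. The following is a minimal illustration, not the paper's formal definition: the class names `Var`, `Const`, `Bind`, the helper `fresh`, and the naming scheme `v0, v1, ...` for relabelled bound variables are all assumptions made for this example.

```python
from dataclasses import dataclass
from itertools import count
from typing import Tuple, Union


@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class Const:
    symbol: str                      # a symbol from C_0, ..., C_3
    args: Tuple["Term", ...] = ()    # between 0 and 3 argument pseudoterms


@dataclass(frozen=True)
class Bind:
    quant: str                       # a symbol from QUANT
    var: str                         # the bound variable x
    dom: "Term"                      # M  in (Qx : M)M'
    body: "Term"                     # M' in (Qx : M)M'


Term = Union[Var, Const, Bind]


def free_vars(t):
    if isinstance(t, Var):
        return {t.name}
    if isinstance(t, Const):
        return set().union(*(free_vars(a) for a in t.args))
    # x is bound in the body M' only, not in the domain M
    return free_vars(t.dom) | (free_vars(t.body) - {t.var})


def fresh(avoid):
    """Hypothetical naming scheme: first v0, v1, ... not in `avoid`."""
    return next(f"v{i}" for i in count() if f"v{i}" not in avoid)


def subst(t, sub):
    """Simultaneous substitution t[M_1..M_n / x_1..x_n]; `sub` maps x_i to M_i."""
    if isinstance(t, Var):
        return sub.get(t.name, t)
    if isinstance(t, Const):
        return Const(t.symbol, tuple(subst(a, sub) for a in t.args))
    dom = subst(t.dom, sub)
    body_fv = free_vars(t.body)
    # Inside the body, x is rebound; keep only substitutions that can fire.
    inner = {x: m for x, m in sub.items() if x != t.var and x in body_fv}
    incoming = set().union(*(free_vars(m) for m in inner.values()))
    var, body = t.var, t.body
    if var in incoming:              # relabel the bound variable to avoid capture
        new = fresh(incoming | body_fv | set(inner))
        body = subst(body, {var: Var(new)})
        var = new
    return Bind(t.quant, var, dom, subst(body, inner))
```

For instance, substituting `x` for `f` in `(Pi x : A) app(f, x)` relabels the bound `x` first, so the incoming free `x` is not captured.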

The Pseudojudgements and Formal Judgements of a Type Theory.

Definition 1 A pseudojudgement has the form $\Gamma \Rightarrow B$ where $\Gamma$ is a
pseudocontext and $B$ is a pseudobody, as defined below.

- A pseudocontext is a finite sequence $x_1 : M_1, \ldots, x_n : M_n$ of
  pseudodeclarations, $x_i : M_i$ for $i = 1, \ldots, n$ where each $M_i$ is a pseudoterm and
  each $x_i$ : VAR and, for $1 \le j < i$, $x_i$ is distinct from $x_j$ and is not free
  in $M_j$.

- A pseudobody has one of the following four forms.

$$M \ \text{type}, \qquad M_0 : M, \qquad M_1 = M_2, \qquad M_1 = M_2 : M$$

When the pseudocontext is the empty sequence then we get a pseudojudgement
$\Rightarrow B$ which will usually simply be written $B$.

If $\Gamma$ is a pseudocontext $x_1 : M_1, \ldots, x_n : M_n$ then a variable $y$ is new to $\Gamma$ if $y$
is distinct from each $x_i$ and not free in any $M_i$.

Note: If $\Gamma$ is a pseudocontext $x_1 : M_1, \ldots, x_n : M_n$, the variable $x$ is distinct
from each $x_i$ and $M$ is a pseudoterm that has no free occurrences of any $x_i$
then $x_1 : M_1[M/x], \ldots, x_n : M_n[M/x]$ is also a pseudocontext that we will
abbreviate $\Gamma[M/x]$. Also we can define the result $B[M/x]$ of substituting $M$
for $x$ in a pseudobody $B$ in the obvious way. For example $(M_1 = M_2)[M/x]$ is
defined to be $M_1[M/x] = M_2[M/x]$.

The rules of inference of the type theories that we will consider will be given
schematically and will have instances of the following form.

$$\frac{J_1 \ \cdots \ J_k}{J}$$

where $k \ge 0$ and $J_1 \cdots J_k$ are the premisses and $J$ is the conclusion of the
instance, both the premisses and the conclusion being pseudojudgements. When
$k = 0$, so that there are no premisses then the line above the conclusion will be
omitted in writing the inference.
The schemes presenting the rules will have the abbreviated form

$$\frac{\Delta_1 \Rightarrow B_1 \ \cdots \ \Delta_k \Rightarrow B_k}{\Delta \Rightarrow B}$$

which is unabbreviated by making explicit an implicit pseudocontext metavariable
$\Gamma$ of the scheme by adding it to the front of the left hand side of each
premiss and the conclusion to get the scheme

$$\frac{\Gamma, \Delta_1 \Rightarrow B_1 \ \cdots \ \Gamma, \Delta_k \Rightarrow B_k}{\Gamma, \Delta \Rightarrow B}$$

Note that an unabbreviated scheme will generally involve metavariables and an
instance of the scheme will be obtained by substituting for the metavariables,
provided that the side conditions of the scheme hold.

A pseudojudgement is a theorem and so a formal judgement of the type
theory, if it is in the smallest class of pseudojudgements that includes the
conclusion whenever it includes the premisses of any instance of a rule of the type
theory. Whenever a pseudocontext $\Gamma$ appears in a formal judgement $\Gamma \Rightarrow B$ then
we call $\Gamma$ a context.

All our type theories will have a common list of general rules of inference. 
These come under three headings, assumption rules, equality rules and substi- 
tution rules. 



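General Rules

Assumption Rules In these rules the variable $x$ must be new to the implicit
context $\Gamma$; i.e. not appear in $\Gamma$. Also, in the second assumption rule, the
variable $x$ must not be declared in $\Delta$.

$$\frac{A \ \text{type}}{x : A \Rightarrow x : A}
\qquad
\frac{\Delta \Rightarrow B \qquad A \ \text{type}}{x : A, \Delta \Rightarrow B}$$

Equality Rules

$$\frac{A \ \text{type}}{A = A}
\qquad
\frac{A_1 = A_2}{A_2 = A_1}
\qquad
\frac{A_1 = A_2 \qquad A_2 = A_3}{A_1 = A_3}$$

$$\frac{a : A}{a = a : A}
\qquad
\frac{a_1 = a_2 : A}{a_2 = a_1 : A}
\qquad
\frac{a_1 = a_2 : A \qquad a_2 = a_3 : A}{a_1 = a_3 : A}$$

$$\frac{a : A_1 \qquad A_1 = A_2}{a : A_2}
\qquad
\frac{a_1 = a_2 : A_1 \qquad A_1 = A_2}{a_1 = a_2 : A_2}$$

Substitution Rule

$$\frac{x : A, \Delta \Rightarrow B \qquad a : A}{\Delta[a/x] \Rightarrow B[a/x]}$$

Congruence Rules

$$\frac{x : A, \Delta \Rightarrow C \ \text{type} \qquad a_1 = a_2 : A}{\Delta[a_1/x] \Rightarrow C[a_1/x] = C[a_2/x]}
\qquad
\frac{x : A, \Delta \Rightarrow c : C \qquad a_1 = a_2 : A}{\Delta[a_1/x] \Rightarrow c[a_1/x] = c[a_2/x] : C[a_1/x]}$$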
1.2 Types-as-Sets

We now assume given a fixed type theory T and a fixed set theory S. We will
work informally in the set theory S.

A types-as-sets interpretation (TS interpretation) of T in S is determined by
the following set theoretic data.

- For each $c_0$, a set $c_0^\tau$.

- For each $c_n$, where $n = 1, 2, 3$, a definable $n$-place operation $c_n^\tau$ assigning a
  set $c_n^\tau(A_1, \ldots, A_n)$ to each $n$-tuple $A_1, \ldots, A_n$ of sets.

- For each $Q$, a definable operation $Q^\tau$ that assigns to each set $B$ that is a
  function a set $Q^\tau(B)$. In practice, if $A$ is a set and $F$ is a definable unary
  operation on sets then, using the Replacement Axiom Scheme, that will be
  available in our set theory, we may form the set $B = \{(a, F(a)) \mid a \in A\}$
  which is a function defined on $A$. The result of applying $Q^\tau$ to this set $B$
  will be written $(Q^\tau a \in A)F(a)$.



The Interpretation Functions. By a variable assignment we mean a set
theoretic function $\xi$ that assigns a set $\xi(x)$ to each variable $x$. We can define
the interpretation function mapping each variable assignment $\xi$ to the
interpretation $[[M]]_\xi$ of $M$, for each pseudoterm $M$. The definition is by structural
induction on the formation of the pseudoterm $M$, using the variable assignment
when $M$ is a variable and using the corresponding operation on sets, as
illustrated earlier, for each other form of expression. In the following $n = 1, 2$ or $3$.

$$[[x]]_\xi = \xi(x)$$

$$[[c_0]]_\xi = c_0^\tau$$

$$[[c_n(M_1, \ldots, M_n)]]_\xi = c_n^\tau([[M_1]]_\xi, \ldots, [[M_n]]_\xi)$$

$$[[(Qx : M)M']]_\xi = (Q^\tau a \in [[M]]_\xi)\,[[M']]_{\xi(a/x)}$$

Here $\xi(a/x)$ is the variable assignment $\xi'$ that is like $\xi$ except that $\xi'(x) = a$.
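
The structural recursion defining the interpretation can be sketched on a toy signature. This is an illustration under simplifying assumptions, not the paper's construction: finite sets are modelled as Python `frozenset`s with integers as atoms, functions are sets of Python tuples rather than set theoretic ordered pairs, pseudoterms are nested tuples, and the tables `CONSTS`, `OPS`, `QUANTS` and their entries are hypothetical.

```python
from itertools import product


def sigma(B):
    """Toy Sigma operation: all pairs (a, y) with y in B(a), B a set of pairs."""
    return frozenset((a, y) for a, Ba in B for y in Ba)


def pi(B):
    """Toy Pi operation: all functions f on dom(B) with f(a) in B(a)."""
    graph = list(B)
    args = [a for a, _ in graph]
    return frozenset(frozenset(zip(args, values))
                     for values in product(*(Ba for _, Ba in graph)))


# Interpretations of constants, n-ary symbols and quantifiers (all assumed).
CONSTS = {"one": frozenset({0}),
          "two": frozenset({0, 1}),
          "three": frozenset({0, 1, 2})}
OPS = {"R2": lambda a, b, c: a if c == 0 else b}   # case distinction on c in two
QUANTS = {"Sigma": sigma, "Pi": pi}


def interp(term, xi):
    """Interpretation of a pseudoterm under the variable assignment xi (a dict)."""
    tag = term[0]
    if tag == "var":                       # [[x]] = xi(x)
        return xi[term[1]]
    if tag == "const":                     # [[c_n(M_1..M_n)]] = c_n([[M_1]], ...)
        _, c, *args = term
        if not args:
            return CONSTS[c]
        return OPS[c](*(interp(m, xi) for m in args))
    _, q, x, dom, body = term              # [[(Qx : M)M']]
    # B = {(a, [[M']] under xi(a/x)) | a in [[M]]}, then apply the quantifier.
    B = frozenset((a, interp(body, {**xi, x: a})) for a in interp(dom, xi))
    return QUANTS[q](B)


family = ("const", "R2", ("const", "one"), ("const", "three"), ("var", "x"))
plus = ("bind", "Sigma", "x", ("const", "two"), family)    # a disjoint sum
arrow = ("bind", "Pi", "x", ("const", "two"), family)      # a dependent product
```

In this toy model `interp(plus, {})` has 1 + 3 = 4 elements and `interp(arrow, {})` has 1 × 3 = 3 elements, matching the usual cardinalities of a disjoint sum and a dependent product.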

The following lemmas are proved by a routine induction on the structure of
the pseudoterm $M$.

Lemma 2 If the variable $x$ is not free in the pseudoterm $M$ and $\xi, \xi'$ are variable
assignments that agree except possibly at $x$ then $[[M]]_\xi = [[M]]_{\xi'}$.

Lemma 3 (Substitution Lemma) $[[M'[M/x]]]_\xi = [[M']]_{\xi([[M]]_\xi/x)}$ for all
pseudoterms $M$, $M'$, all variables $x$ and all variable assignments $\xi$.

Definition 4 If $\Gamma$ is a pseudocontext $x_1 : M_1, \ldots, x_n : M_n$ then let $\xi \models \Gamma$ if
$\xi(x_i) \in [[M_i]]_\xi$ for $i = 1, \ldots, n$.

Lemma 5 If $\Gamma$ is a pseudocontext $x_1 : M_1, \ldots, x_n : M_n$, $x$ is a variable distinct
from each $x_i$ and $M$ is a pseudoterm that has no free occurrences of any $x_i$ then
$\xi \models \Gamma[M/x] \iff \xi([[M]]_\xi/x) \models \Gamma$ for each variable assignment $\xi$.
Definition 6 We define $\xi \models B$ for each form of pseudobody $B$.

- $\xi \models M \ \text{type}$ for any pseudoterm $M$,

- $\xi \models M_1 = M_2$ if $[[M_1]]_\xi = [[M_2]]_\xi$,

- $\xi \models M : M'$ if $[[M]]_\xi \in [[M']]_\xi$,

- $\xi \models M_1 = M_2 : M'$ if $[[M_1]]_\xi = [[M_2]]_\xi \in [[M']]_\xi$.

Lemma 7 $\xi \models B[M/x] \iff \xi([[M]]_\xi/x) \models B$.

Definition 8 A pseudojudgement $\Gamma \Rightarrow B$ is valid, written $\models \Gamma \Rightarrow B$ if, for all
variable assignments $\xi$, $\xi \models \Gamma$ implies $\xi \models B$.

Definition 9 (Soundness) A rule of inference is sound if, for every instance
$\frac{J_1 \ \cdots \ J_k}{J}$ of the rule, if the premisses are valid then so is the conclusion; i.e.
$[\models J_1 \ \&\ \cdots \ \&\ \models J_k]$ implies $\models J$. A type theory T is sound if each of its rules
is sound. When we have a sound TS interpretation of a type theory T in a set
theory S we will write T ≤_TS S.



The following result is by structural induction following the inductive definition 
of the formal judgements of a type theory. 

Lemma 10 If the type theory T is sound then every formal judgement of T is 
valid. 



Proposition 11 Each general rule is sound. Moreover, for each quantifier $Q$ of
the type theory the following congruence rule is sound.

$$\frac{x : M \Rightarrow M_1 = M_2}{(Qx : M)M_1 = (Qx : M)M_2}$$

The proof of this result is straightforward. The assumption and equality rules 
are trivial. The substitution and congruence rules make use of previously stated 
lemmas. 



2 The Theory MLW^ext

We will start with the theory MLW. The abstract syntax of the theory is
determined by the following syntax equations.

$$C_0 ::= 0 \mid 1 \mid 2 \mid * \mid 1 \mid 2, \qquad C_1 ::= R_0 \mid \pi_1 \mid \pi_2,$$

$$C_2 ::= R_1 \mid pair \mid sup \mid app \mid rec, \qquad C_3 ::= R_2, \qquad Q ::= \Pi \mid \Sigma \mid W \mid \lambda.$$

2.1 Some Defined Forms of Pseudoterm

$$(M_1 \rightarrow M_2) = (\Pi\_ : M_1)M_2 \qquad (M_1 \times M_2) = (\Sigma\_ : M_1)M_2$$

$$(M_1 + M_2) = (\Sigma x : 2)R_2(M_1, M_2, x) \qquad N = (Wx : 2)R_2(0, 1, x)$$

Note that the underscore, in the first two definitions represents a vacuous
variable; i.e. a variable that is being bound by $\Pi$ and $\Sigma$ but does not occur
in $M_2$.
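2.2 Special Rules for MLW

Type Formation Rules

$$c \ \text{type} \quad (c \in \{0, 1, 2\})$$

$$\frac{A_1 \ \text{type} \qquad A_2 \ \text{type} \qquad c : 2}{R_2(A_1, A_2, c) \ \text{type}}
\qquad
\frac{x : A \Rightarrow B \ \text{type}}{(Qx : A)B \ \text{type}} \quad (Q \in \{\Pi, \Sigma, W\})$$

Using the definitions above we have the following derived type formation
rules.

$$N \ \text{type}
\qquad
\frac{A_1 \ \text{type} \qquad A_2 \ \text{type}}{(A_1 \,\#\, A_2) \ \text{type}} \quad (\# \in \{\rightarrow, \times, +\})$$

Introduction Rules

$$* : 1 \qquad 1 : 2 \qquad 2 : 2$$

$$\frac{x : A \Rightarrow b : B}{(\lambda x : A)b : (\Pi x : A)B}$$

$$\frac{x : A \Rightarrow B \ \text{type} \qquad a : A \qquad b : B[a/x]}{pair(a, b) : (\Sigma x : A)B}$$

$$\frac{x : A \Rightarrow B \ \text{type} \qquad a : A \qquad f : (B[a/x] \rightarrow (Wx : A)B)}{sup(a, f) : (Wx : A)B}$$

Special Congruence Rules

$$\frac{x : A \Rightarrow B_1 = B_2}{(Qx : A)B_1 = (Qx : A)B_2} \quad (Q \in \{\Pi, \Sigma, W\})$$

$$\frac{x : A \Rightarrow b_1 = b_2 : B}{(\lambda x : A)b_1 = (\lambda x : A)b_2 : (\Pi x : A)B}$$

Elimination Rules

$$\frac{x : 0 \Rightarrow C \ \text{type} \qquad a : 0}{R_0(a) : C[a/x]}
\qquad
\frac{x : 1 \Rightarrow C \ \text{type} \qquad a : 1 \qquad c : C[*/x]}{R_1(c, a) : C[a/x]}$$

$$\frac{x : 2 \Rightarrow C \ \text{type} \qquad a : 2 \qquad c_1 : C[1/x] \qquad c_2 : C[2/x]}{R_2(c_1, c_2, a) : C[a/x]}$$

$$\frac{x : A \Rightarrow B \ \text{type} \qquad f : (\Pi x : A)B \qquad a : A}{app(f, a) : B[a/x]}$$

$$\frac{x : A \Rightarrow B \ \text{type} \qquad c : (\Sigma x : A)B}{\begin{cases} \pi_1(c) : A \\ \pi_2(c) : B[\pi_1(c)/x] \end{cases}}$$

$$\frac{x : A \Rightarrow B \ \text{type} \qquad z : W \Rightarrow C \ \text{type} \qquad b : (\Pi x : A)(\Pi u : B \rightarrow W)D(x, u) \qquad e : W}{rec(b, e) : C[e/z]}$$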
In the last rule we used $W$ to abbreviate $(Wx : A)B$ and $D(x, u)$ to abbreviate
$(\Pi y : B)C[app(u, y)/z] \rightarrow C[sup(x, u)/z]$.

Computation Rules

$$\frac{x : 1 \Rightarrow C \ \text{type} \qquad c : C[*/x]}{R_1(c, *) = c : C[*/x]}$$

$$\frac{x : 2 \Rightarrow C \ \text{type} \qquad c_1 : C[1/x] \qquad c_2 : C[2/x]}{\begin{cases} R_2(c_1, c_2, 1) = c_1 : C[1/x] \\ R_2(c_1, c_2, 2) = c_2 : C[2/x] \end{cases}}$$

$$\frac{x : A \Rightarrow b : B \qquad a : A}{app((\lambda x : A)b, a) = b[a/x] : B[a/x]}$$

$$\frac{x : A \Rightarrow B \ \text{type} \qquad a : A \qquad b : B[a/x]}{\begin{cases} \pi_1(pair(a, b)) = a : A \\ \pi_2(pair(a, b)) = b : B[a/x] \end{cases}}$$

$$\frac{x : A \Rightarrow B \ \text{type} \qquad z : W \Rightarrow C \ \text{type} \qquad b : (\Pi x : A)(\Pi u : B \rightarrow W)D(x, u) \qquad a : A \qquad f : B[a/x] \rightarrow W}{rec(b, sup(a, f)) = app(app(app(b, a), f), g) : C[sup(a, f)/z]}$$

In this last rule we used the following abbreviations.

$W$ for $(Wx : A)B$,

$D(x, u)$ for $(\Pi y : B)C[app(u, y)/z] \rightarrow C[sup(x, u)/z]$,

$g$ for $(\lambda y : B[a/x])rec(b, app(f, y))$.



2.3 Extending to MLW^ext

We first extend the syntax equations using $C_2 ::= \cdots \mid EQ$ and $C_3 ::= \cdots \mid Eq$.

We add the rules of inference given by the following schemes in abbreviated 
form. 
2.4 The TS Interpretation of MLW^ext in ZFC

We will work informally in the set theory ZFC. We use the usual von Neumann
definition of the natural numbers; i.e. $0 = \emptyset$, $1 = \{0\}$, $2 = \{0, 1\}$, etc. Ordered
pairs are defined as usual; i.e. for sets $a, b$ we define $(a, b) = \{\{a\}, \{a, b\}\}$. As usual
functions are single valued sets of ordered pairs. For any set $b$, its domain and
range are the sets $dom(b) = \{x \mid \exists y\, (x, y) \in b\}$ and $ran(b) = \{y \mid \exists x\, (x, y) \in b\}$.

If $a$ is a set and $B$ is a definable operation that assigns a set $B(x)$ to each $x \in a$
then we let $\Pi_{x \in a}B(x)$ be the set of all the functions $f$, with domain $a$, such that
$f(x) \in B(x)$ for all $x \in a$. Also, we let $\Sigma_{x \in a}B(x)$ be the set of all pairs $(x, y)$
such that $x \in a$ and $y \in B(x)$.
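
For hereditarily finite sets these definitions can be tried out directly. The sketch below is only an illustration: it models sets as Python `frozenset`s, and the helper names `numeral`, `pair`, `unpair`, `dom`, `ran` and `sigma_set` are assumptions of this example rather than notation from the paper.

```python
def numeral(n):
    """Von Neumann numeral: 0 is the empty set and n + 1 = n ∪ {n}."""
    s = frozenset()
    for _ in range(n):
        s = s | {s}
    return s


def pair(a, b):
    """Ordered pair (a, b) = {{a}, {a, b}}."""
    return frozenset({frozenset({a}), frozenset({a, b})})


def unpair(p):
    """Return (a, b) if p is an ordered pair of that form, otherwise None."""
    if len(p) == 1:                        # (a, a) collapses to {{a}}
        (s,) = p
        if len(s) == 1:
            (a,) = s
            return a, a
        return None
    if len(p) == 2:
        s, t = sorted(p, key=len)
        if len(s) == 1 and len(t) == 2 and s <= t:
            (a,) = s
            (b,) = t - s
            return a, b
    return None


def dom(b):
    """dom(b) = {x | there is y with (x, y) in b}."""
    return frozenset(u[0] for u in map(unpair, b) if u is not None)


def ran(b):
    """ran(b) = {y | there is x with (x, y) in b}."""
    return frozenset(u[1] for u in map(unpair, b) if u is not None)


def sigma_set(a, B):
    """All pairs (x, y) with x in a and y in B(x), for a Python function B."""
    return frozenset(pair(x, y) for x in a for y in B(x))


# The function {(0, 1), (1, 2)} on the domain 2 = {0, 1}.
f = frozenset({pair(numeral(0), numeral(1)), pair(numeral(1), numeral(2))})
```

Note that `dom` and `ran` simply ignore elements of `b` that are not ordered pairs, which matches the set-builder definitions above.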

A function coding in set theory consists of a pair of definable operations
APP, LAM on sets, APP being binary and LAM being unary, such that if $f$ is
a function and $a \in dom(f)$ then

$$APP(LAM(f), a) = f(a).$$

The standard example of a function coding is given by the definitions

$$APP(a, b) = \{x \in \textstyle\bigcup\bigcup\bigcup a \mid \exists y\,[x \in y \ \&\ (b, y) \in a]\},$$

$$LAM(a) = a$$

for all sets $a, b$. Later it will be convenient to use a non-standard function coding.
In the following we assume given some function coding. Given sets $a, b, c, d$ let

$$EXP(a, b) = \{LAM(f) \mid f : a \rightarrow b\}$$

$$\mathbf{\Pi}_{x \in a}B(x) = \{LAM(f) \mid f \in \Pi_{x \in a}B(x)\} \quad \text{if } B(x) \text{ is a set for each } x \in a$$

$$APP_2(a, b, c) = APP(APP(a, b), c)$$

$$APP_3(a, b, c, d) = APP(APP(APP(a, b), c), d)$$
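
The standard function coding can be checked on small finite sets. The following sketch assumes sets are modelled as Python `frozenset`s; the helpers `numeral`, `pair` and `big_union` are names chosen for this illustration, while `APP`, `LAM`, `EXP` and `APP2` follow the definitions in the text.

```python
from itertools import product


def numeral(n):
    """Von Neumann numeral: 0 is the empty set and n + 1 = n ∪ {n}."""
    s = frozenset()
    for _ in range(n):
        s = s | {s}
    return s


def pair(a, b):
    """Ordered pair (a, b) = {{a}, {a, b}}."""
    return frozenset({frozenset({a}), frozenset({a, b})})


def big_union(a):
    """Union of all the elements of a."""
    return frozenset(x for y in a for x in y)


def LAM(f):
    """Standard coding: a function is its own code."""
    return f


def APP(a, b):
    """All x with x in y for some y such that (b, y) is in a."""
    # Any y with (b, y) in a lies in the double union of a, so it suffices
    # to search there; the resulting x then lie in the triple union.
    level2 = big_union(big_union(a))
    return frozenset(x for y in level2 if pair(b, y) in a for x in y)


def EXP(a, b):
    """Codes of all functions from a to b (finite sets only)."""
    xs = list(a)
    return frozenset(LAM(frozenset(pair(x, y) for x, y in zip(xs, ys)))
                     for ys in product(b, repeat=len(xs)))


def APP2(a, b, c):
    return APP(APP(a, b), c)


# f(0) = 2 and f(1) = 3, as a single valued set of ordered pairs.
f = frozenset({pair(numeral(0), numeral(2)), pair(numeral(1), numeral(3))})
```

With this coding, applying `APP` outside the domain of the function returns the empty set rather than failing, since no witness `y` exists.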

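We now present the set theoretic interpretations of the syntactic operations
of MLW^ext, leaving the interpretations for the W rules until later.

$$0^\tau = 0, \quad 1^\tau = 1, \quad 2^\tau = 2, \qquad *^\tau = 0, \quad 1^\tau = 0, \quad 2^\tau = 1$$

Here the first three equations interpret the types 0, 1, 2 and the last three
interpret the elements * : 1 and 1, 2 : 2.

$$R_0^\tau(a) = a, \quad \pi_1^\tau(a) = \textstyle\bigcup\{x \mid \exists y\, (x, y) = a\}, \quad \pi_2^\tau(a) = \textstyle\bigcup\{y \mid \exists x\, (x, y) = a\}$$

$$R_1^\tau(a, b) = a, \quad pair^\tau(a, b) = (a, b), \quad app^\tau(a, b) = APP(a, b)$$

$$R_2^\tau(a, b, c) = \{x \mid (c = 1^\tau \ \&\ x \in a) \lor (c = 2^\tau \ \&\ x \in b)\}$$

$$EQ^\tau(a, b) = \{x \mid x = 0 \ \&\ a = b\}, \quad Eq^\tau(a, b, c) = \{x \mid x = 0 \ \&\ b = c \ \&\ b \in a\}$$

If $b$ is a function with domain $a$ let

$$\lambda^\tau(b) = LAM(b), \quad \Pi^\tau(b) = \mathbf{\Pi}_{x \in a}b(x), \quad \Sigma^\tau(b) = \Sigma_{x \in a}b(x)$$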

To deal with the W rules we will need the following result. 
Theorem 12

1. For each set $b$ there is a smallest set $W$ such that if $(x, y) \in b$ and
   $f \in EXP(y, W)$ then $(x, f) \in W$. We write $\mathcal{W}(b)$ for this set $W$.

2. Given a set $g$ let $Y(g) = \Sigma_{x \in dom(g)}\Sigma_{u \in dom(APP(g, x))}\,dom(APP_2(g, x, u))$ and
   if $(x, (u, v)) \in Y(g)$ then let $X_{u,v} = \{(APP(u, y), APP(v, y)) \mid y \in dom(u)\}$.
   There is a smallest set $f$ such that if $(x, (u, v)) \in Y(g)$ and $X_{u,v} \subseteq f$, then
   $((x, u), APP_3(g, x, u, v)) \in f$. We write $\mathcal{R}(g)$ for this set $f$.

3. Let $a, b, c$ be sets such that $b, c$ are functions with $dom(b) = a$ and $dom(c) =
   \mathcal{W}(b)$ and let $W = \mathcal{W}(b)$. Let $g \in \mathbf{\Pi}_{x \in a}\mathbf{\Pi}_{u \in EXP(b(x), W)}\,d((x, u))$ where
   $d(w) = EXP(\mathbf{\Pi}_{y \in b(x)}c(APP(u, y)), c(w))$ for $w = (x, u) \in W$. Then $\mathcal{R}(g)$ is
   the unique function $f \in \Pi_{w \in W}c(w)$ such that if $w = (x, u) \in W$ then

   $$f(w) = APP_3(g, x, u, LAM(H(f, u))).$$

   Here $H(f, u)$ is the function $h \in \Pi_{y \in b(x)}c(APP(u, y))$ such that
   $h(y) = f(APP(u, y))$ for $y \in b(x)$.



Proof of the Theorem in ZFC.

The first two parts of this theorem are applications of the following
result.

Lemma 13 Let $\Theta$ be a definable operation on sets such that, for some
set $B$, whenever $X$ is a set such that $\Theta(X)$ has an element then there is
a surjective function $f : b \rightarrow X$ for some $b \in B$. Then there is a smallest
class $I$ such that

$$X \subseteq I \implies \Theta(X) \subseteq I.$$

Moreover $I$ is a set.

To prove part 1 of the theorem, using this lemma, it suffices to let

$$\Theta(X) = \bigcup_{(x, y) \in b} \{(x, LAM(f)) \mid f : y \rightarrow X \text{ is onto } X\},$$

and choose $B = ran(b)$. For part 2 we let

$$\Theta(X) = \{((x, u), APP_3(g, x, u, v)) \mid (x, (u, v)) \in Y(g) \ \&\ X = X_{u,v}\},$$

and choose $B = \{X_{u,v} \mid (x, (u, v)) \in Y(g)\}$. For part 3 of the theorem,
first observe that, by an easy induction following the inductive definition
of $\mathcal{R}(g)$, $dom(\mathcal{R}(g)) \subseteq W$. Now, by another easy induction, this time on
the inductive definition of $W$, observe that, for each $w = (x, u) \in W$,
$APP_3(g, x, u, LAM(H(f, u)))$ is the unique $z$ such that $(w, z) \in \mathcal{R}(g)$
and moreover $z \in c(w)$. All this shows that $\mathcal{R}(g)$ is an $f$ satisfying the
desired conditions. Finally, another proof by induction on $W$ will show
that $\mathcal{R}(g)$ is the unique $f$ satisfying these conditions.
We now turn to the proof of the lemma. Let $F$ be the operation on
sets given by

$$F(Y) = \bigcup_{X \in Pow(Y)} \Theta(X),$$

for each set $Y$. The operation $F$ is monotone and we must show that it
has a least fixed point. By transfinite recursion on ordinals we can define
sets $I^\alpha$ for ordinals $\alpha$, so that $I^\alpha = F(I^{<\alpha})$, where $I^{<\alpha} = \bigcup_{\beta < \alpha} I^\beta$.
Let $\kappa$ be an infinite regular ordinal such that $card(b) < \kappa$ for all $b \in B$.

We now claim that $I^\kappa \subseteq I^{<\kappa}$. To see this, let $a \in I^\kappa$. Then $a \in \Theta(X)$
for some set $X \subseteq I^{<\kappa}$. For each $x \in X$ let $h(x)$ be the least ordinal $\gamma < \kappa$
such that $x \in I^\gamma$. By the assumption on $\Theta$ there is $b \in B$ and a function
$f : b \rightarrow X$ that is onto $X$. If $\alpha = card(b)$ then $\alpha < \kappa$ and there is a
function $g : \alpha \rightarrow b$ that is onto $b$. It follows that $h \circ f \circ g : \alpha \rightarrow \kappa$. As $\kappa$
is regular there is $\beta < \kappa$ such that $h \circ f \circ g : \alpha \rightarrow \beta$. As $f \circ g$ is onto $X$
it follows that $h : X \rightarrow \beta$ so that $X \subseteq I^{<\beta}$ and hence $a \in I^\beta \subseteq I^{<\kappa}$.

It is a standard consequence of this claim that $I^{<\kappa}$ is the least fixed
point of $F$ and so is the desired set $I$ of the lemma.⁸
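
The stage-by-stage construction in this proof has a simple finite analogue, sketched below. This is an illustration only: finitely many natural-number stages stand in for the ordinals, sets are Python `frozenset`s, and the operation `theta` is a hypothetical toy example (it generates 0 from the empty set and n + 1 from the singleton {n} while n < 4), not one of the operations used in the theorem.

```python
from itertools import chain, combinations


def powerset(y):
    """All subsets of the finite set y."""
    ys = list(y)
    return (frozenset(c) for r in range(len(ys) + 1)
            for c in combinations(ys, r))


def F(theta, y):
    """F(Y) = union of theta(X) over all subsets X of Y."""
    return frozenset(chain.from_iterable(theta(x) for x in powerset(y)))


def least_fixed_point(theta, max_stages=100):
    """Iterate the stages until a stage adds nothing new."""
    below = frozenset()                  # union of all earlier stages
    for _ in range(max_stages):
        stage = F(theta, below)          # next stage = F(earlier stages)
        if stage <= below:               # finite analogue of the claim in the proof
            return below
        below = below | stage
    raise RuntimeError("no fixed point reached within max_stages")


def theta(x):
    """Toy operation: {0} from the empty set, {n + 1} from {n} when n < 4."""
    if not x:
        return frozenset({0})
    if len(x) == 1:
        (n,) = x
        return frozenset({n + 1}) if n < 4 else frozenset()
    return frozenset()


lfp = least_fixed_point(theta)
```

Here every nonempty value of `theta` comes from a set with at most one element, which plays the role of the bounding set B in the lemma and is what makes the iteration stop.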

To interpret the extra syntax needed for the W rules we use $sup^\tau(a, b) = (a, b)$,
$rec^\tau(a, b) = \mathcal{R}(a)(b)$ and if $b$ is a function we use $W^\tau(b) = \mathcal{W}(b)$.

Theorem 14 (ZFC) The type theory MLW^ext is sound.

This result gives a proof theoretic reduction of the type theory MLW^ext to the
set theory ZFC. We write MLW^ext ≤_TS ZFC to express this reduction. The type
theory is constructive in the sense that when the propositions-as-types idea is
used to represent logic then intuitionistic logic is represented and the law of
excluded middle is not justified. On the other hand the set theory is classical. In
the following two subsections we improve on the result by first making the type
theory classical and second by making the set theory constructive.
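
The step from soundness to relative consistency can be spelled out as follows. This is a sketch of the standard argument rather than a quotation from the paper; it writes ξ for an arbitrary variable assignment, ⊨ for validity as in Definition 8, and 0^τ for the interpretation of the empty type 0, which is the empty set (the superscript τ is notation assumed here).

```latex
% Soundness implies relative consistency (illustrative sketch).
\begin{align*}
  & \vdash a : 0
      && \text{(suppose the type theory derives this in the empty context)} \\
  & \models a : 0
      && \text{(soundness, Lemma 10)} \\
  & [[a]]_\xi \in [[0]]_\xi = 0^\tau = \emptyset \ \text{ for every } \xi
      && \text{(definition of validity)} \\
  & \text{the set theory proves that } \emptyset \text{ has an element}
      && \text{(a contradiction in the set theory)}
\end{align*}
% Contrapositive: if the set theory is consistent, then no judgement
% a : 0 is derivable, i.e. the type theory is consistent.
```

Since the passage from a derivation in the type theory to a proof in the set theory is explicit, the implication from the consistency of the set theory to that of the type theory can be verified by finitistic means.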



2.5 Adding Excluded Middle

Recall that the logical notions are represented in MLW by using the
propositions-as-types idea. In particular the operation + on types represents disjunction and
negation is represented by the operation that maps a type $A$ to the type $A \rightarrow 0$.
So to add the law EM of excluded middle to the type theory we extend the
syntax $C_1 ::= \cdots \mid cl$ and add the following rule.

$$\frac{A \ \text{type}}{cl(A) : A + (A \rightarrow 0)}$$

We call the resulting theory MLW + EM.

⁸ This proof of the lemma uses the classical theory of cardinal numbers and uses AC.
I do not think that AC can be avoided. Instead of AC it may be possible to use the
axiom that there are unboundedly many regular ordinals.
We need to extend the TS interpretation by having an equation for the new
form of pseudoterm. To do so we strengthen the axiom system ZFC by adding a
one-place function symbol CH to the language of ZFC and adding the following
global form of the axiom of choice.

$$\forall x\,[x \neq \emptyset \rightarrow CH(x) \in x].$$

The axiom schemes of ZFC should be extended to the extended language. We
call the resulting axiom system ZFGC. Working in this axiom system we can
define an operation CL where, for each set $a$,



We can now let $cl^\tau = CL$. It is easy to check that $\xi \models [cl(A) : A + (A \rightarrow 0)]$
for each pseudoterm $A$ and each variable assignment $\xi$. So we get the result that
MLW^ext + EM ≤_TS ZFGC.

2.6 Reduction to a Constructive Set Theory

We now follow the other strategy to improve on the result MLW^ext ≤_TS ZFC. This
is to weaken ZFC to a constructive set theory. In [1] a constructive set theory
CZF was introduced that is a subtheory of ZF whose logic is intuitionistic. This
set theory was shown to have the property that when excluded middle is added
to the logic then a theory CZF + EM is obtained that has the same theorems as
ZF. Here we will consider the extension CZF^+ = CZF + REA of CZF obtained
by adding to CZF the following axiom, that was first introduced in [3]. First we
define a transitive set $A$ to be a regular set if, for every $a \in A$ and every set
$R \subseteq a \times A$ such that $\forall x \in a\, \exists y \in A\,[(x, y) \in R]$ there is a set $b \in A$ such that
$\forall x \in a\, \exists y \in b\,[(x, y) \in R]$ and $\forall y \in b\, \exists x \in a\,[(x, y) \in R]$.

Regular Extension Axiom (REA) Every set is a subset of a regular set.

The construction, in subsection 2.4, of the TS interpretation of MLW^ext was carried
out in the set theory ZFC. It is straightforward to show that the construction
can be carried through in CZF^+. In fact it can all be carried through in CZF,
except for the proof of Lemma 13. The proof in ZFC that was given here of that
lemma used the power set axiom and some of the classical theory of cardinal
numbers and needed the axiom of choice. Instead we can apply Theorem 5.2
of [3] to see that the lemma is provable in CZF^+.⁹ So we now have the following
result.

Theorem 15 (CZF^+) The type theory MLW^ext is sound.

This can be expressed as MLW^ext ≤_TS CZF^+.

⁹ The status of CZF^+ + EM = ZF + REA is unclear. Every theorem is a theorem of
ZFC. But it is probable that REA is not provable in ZF.
3 Adding Type Universes

In this section we consider natural ways of extending the type theory MLW 
with one or more type universes; i.e. types of types. In each case we define a 
corresponding way of extending set theory so that the TS interpretation extends 
to include the type universes. 



3.1 Adding a Single Reflecting Type Universe, U

We extend the type theory MLW to MLWU by adding a type U of types that
has rules that reflect the type forming rules of MLW. First we extend the syntax
with $C_0 ::= \cdots \mid U$. Next we add the rules given by the following schemes in
abbreviated form.

$$U \ \text{type} \qquad \frac{A : U}{A \ \text{type}} \qquad c : U \quad (c \in \{0, 1, 2\})$$

$$\frac{A : U \qquad x : A \Rightarrow B : U}{(Qx : A)B : U} \quad (Q \in \{\Pi, \Sigma, W\})$$

When extending MLW^ext to MLW^ext U we also need rules for U to reflect Eq
and EQ; i.e.

$$\frac{A : U \qquad a_1 : A \qquad a_2 : A}{Eq(A, a_1, a_2) : U}
\qquad
\frac{A_1 : U \qquad A_2 : U}{EQ(A_1, A_2) : U}$$

In order to extend the TS interpretation to MLW^ext U + EM it suffices to add to
ZFGC the axiom that there is a strongly inaccessible cardinal and interpret U as
the set of all sets of set theoretic rank less than the least strongly inaccessible
cardinal. If we call the resulting set theory ZFGC_1 then we get the reduction
MLW^ext U + EM ≤_TS ZFGC_1. To extend the TS interpretation of MLW^ext in CZF^+
we add to CZF^+ an individual constant u and axioms expressing that u is an
inaccessible set in the sense of [11]¹⁰. We write CZF^+u for the resulting theory.
Now it suffices to take $U^\tau = u$ and we get the reduction MLW^ext U ≤_TS CZF^+u.



3.2 Adding an Infinite Hierarchy, U_0, U_1, ..., of Reflecting Type
Universes

This time we extend the syntax using $C_0 ::= \cdots \mid U_n$ ($n = 0, 1, \ldots$) and add
rules given by the following schemes for $n = 0, 1, \ldots$.

$$U_n \ \text{type} \qquad \frac{A : U_n}{A \ \text{type}} \qquad c : U_n \quad (c \in \{0, 1, 2\})$$

$$\frac{A : U_n \qquad x : A \Rightarrow B : U_n}{(Qx : A)B : U_n} \quad (Q \in \{\Pi, \Sigma, W\})$$

$$U_n : U_{n+1} \qquad \frac{A : U_n}{A : U_{n+1}}$$

¹⁰ i.e. a regular set that is a transitive model of CZF^+.



In the case of MLW®*** we also need the obvious rules for reflecting Eq and EQ. 
We get the resulting type theories MLWU<ij and MLW®^*U<tj. To extend the 
TS interpretation we need to extend the classical and intuitionistic set theories 
in the following way. We add an infinite sequence u„ for n = 0,1,... of indi- 
vidual constants to the set theoretical language and add axioms u„ € u„_|_i for 
n = 0, 1, . . .. In the classical case we also add axioms that express that each u„ 
is the set of sets of rank less than a strongly inaccessible cardinal number and in 
the constructive case we add axioms that express that each u„ is an inaccessible 
set. We write ZFGCu<,^ and CZF^u<(^ for the resulting extensions. We extend 
the TS interpretation by taking Ujj = u„ for each n and get the reductions 
-FilM <Ts ZFGCu«^ and <ts CZF+u«^. 

3.3 Adding an Impredicatively $\Pi$-closed Type Universe P

We extend the syntax with $c_0 ::= \cdots \mid P$ and add rules given by the schemes

$$\frac{A : P \qquad a_1 : A \qquad a_2 : A}{a_1 = a_2 : A}$$

$$\frac{x : A \Rightarrow B : P}{(\Pi x : A)B : P} \qquad \frac{x : A \Rightarrow B_1 = B_2 : P}{(\Pi x : A)B_1 = (\Pi x : A)B_2 : P}$$

With these rules the type P behaves like the impredicative type of propositions
of the calculus of constructions, with the additional properties that 0 : P
and all the propositions in P are proof-irrelevant. Adding these rules we get
the type theories MLWP and $MLW^{ext}P$. To get the type theories MLWPU and
$MLW^{ext}PU$ we need to add the previously given rules for U and also the following
rules so that U reflects P.

Similarly we can define the type theories $MLWPU_{<\omega}$ and $MLW^{ext}PU_{<\omega}$.

We show how to extend the TS interpretation so as to interpret the type P and
justify its rules. In classical set theory we can interpret P as the set $2 = \{0, 1\}$.
But to do so we need to use a non-standard function coding. Recall that our
TS interpretation uses an arbitrary function coding and so far the standard
one has been good enough. But to justify the rules for P we use the following
non-standard function coding.

$$APP(a, b) = \{y \mid (b, y) \in a\},$$
$$LAM(a) = \bigcup_{(x, z) \in a} (\{x\} \times z).$$



The advantage of this function coding over the standard one is that we can
prove the following result, which we express in a form that still usefully holds in
constructive set theory. Recall that $1 = \{0\}$.

Proposition 16 For any set a, if $B(x) \subseteq 1$ for each $x \in a$ then
$$\Pi_{x \in a}B(x) = \{y \in 1 \mid \forall x \in a\,(B(x) = 1)\} \subseteq 1$$
so that $\Pi_{x \in a}B(x) = 1 \iff \forall x \in a\,(B(x) = 1)$.

Note that in classical set theory the subsets of 1 are just the elements of 
2 = {0, 1}. In constructive set theory the subsets of 1 play the role of the small 
extensional propositions and the above result expresses that the PI operation 
behaves like universal quantification on such propositions. 
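
Proposition 16 can be checked on small finite examples. The following Python sketch is only an illustration under stated assumptions: sets are modelled as frozensets, ordered pairs are Python tuples rather than a set-theoretic pairing, and the product set is assumed to consist of the codes LAM(f) of all functions f on a with f(x) in B(x) (that definition lies outside this section). The names `lam`, `app` and `pi` are hypothetical.

```python
from itertools import product

ZERO = frozenset()        # 0, the empty set
ONE = frozenset({ZERO})   # 1 = {0}

def lam(f):
    """LAM(f): the union of {x} x z over all pairs (x, z) in f."""
    return frozenset((x, y) for (x, z) in f for y in z)

def app(a, b):
    """APP(a, b) = {y | (b, y) in a}."""
    return frozenset(y for (x, y) in a if x == b)

def pi(a, B):
    """Assumed product set: the codes LAM(f) of all f on a with f(x) in B(x)."""
    elems = list(a)
    choices = [list(B(x)) for x in elems]
    return frozenset(
        lam(frozenset(zip(elems, vals))) for vals in product(*choices)
    )

a = frozenset({ZERO, ONE})
all_true = pi(a, lambda x: ONE)                          # every B(x) = 1
one_false = pi(a, lambda x: ONE if x == ZERO else ZERO)  # some B(x) = 0
```

In this model `all_true` is 1 and `one_false` is 0, so the product behaves like a universal quantifier over the two-element set a. With the standard coding of functions as sets of pairs, the single function in the first product would be a non-empty set, so the product would not be a subset of 1.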

Using this result we get the soundness of the rules for P and hence the
reductions $MLW^{ext}P + EM <_{TS} ZFGC$, $MLW^{ext}PU + EM <_{TS} ZFGC_1$ and
$MLW^{ext}PU_{<\omega} + EM <_{TS} ZFGCu_{<\omega}$. In constructive set theory we cannot use
$Pow(1) = \{x \mid x \subseteq 1\}$ to interpret the type P as the class $Pow(1)$ cannot be
shown to be a set in CZF or its constructive extensions. Instead we will here
simply extend the theory to give us what we want. So we add a new individual
constant p to the language and add the following axioms.

1. $0 \in p$,

2. $\forall x \in p\ \ x \subseteq 1$,

3. If B is a function with domain the set a such that $\forall x \in a\ \ B(x) \in p$ then
$\Pi_{x \in a}B(x) \in p$.

This gives us the extension $CZF^+p$. For the theories $CZF^+pu$, $CZF^+pu_{<\omega}$ we also
need the axioms $p \in u$, $p \in u_0$ respectively.

Of course in the TS interpretations in our constructive set theories we
interpret P as p and get the reductions: $MLW^{ext}P <_{TS} CZF^+p$, $MLW^{ext}PU <_{TS} CZF^+pu$
and $MLW^{ext}PU_{<\omega} <_{TS} CZF^+pu_{<\omega}$.

4 Interpreting Set Theories in Type Theories 

We now explore to what extent the proof theoretic reductions we have obtained
using the TS interpretation can be reversed using what we will here call the ST
interpretation. This is the sets-as-trees interpretation that was introduced and
developed in [1,2,3] and has also been used in [10,11]. It is used to interpret a set
theory in a type theory. The idea for the original interpretation, in [1], of CZF
in MLWU was to interpret the sets of CZF as the well-founded trees of the type
$V = (Wx : U)x$, the membership and equality relations of CZF being interpreted
as terms $\in_V, =_V$ of type $V \to (V \to U)$. Using the propositions-as-types idea each
sentence of CZF was interpreted as a type of MLWU and it was shown that each
theorem of CZF is an inhabited type of MLWU; i.e. a type A such that a : A can be
derived in MLWU for some term a. In this way a proof theoretic reduction of CZF
to MLWU is obtained that will be expressed as $CZF <_{ST} MLWU$. In fact, as
shown in [3], we get $CZF^+ <_{ST} MLWU$. Also, it is easy to see that, using the rule
EM of MLWU + EM we can justify both the law of excluded middle and global
choice for the set theory so as to get the reduction $ZFGC <_{ST} MLWU + EM$.
Unfortunately this and the previous reduction do not match up exactly with our
earlier TS reductions. The trouble is the need to use a type universe U in our
ST interpretation. In order to interpret the type universe in set theory we need
to strengthen the set theory with a set theoretic version; i.e. an inaccessible set
in the constructive set theory case and a strongly inaccessible cardinal in the
classical set theory case. Now, if we wish to extend the ST interpretation of $CZF^+$
to an interpretation of $CZF^+u$, we need to use two of the type universes $U_0, U_1$
of $MLWU_{<\omega}$ and their rules and use the type $V_1 = (Wx : U_1)x$ to interpret the
universe of sets of $CZF^+u$. The inaccessible set u of $CZF^+u$ can be modelled by
$sup(V_0, (\lambda x : V_0)h(x)) : V_1$ where $V_0 = (Wx : U_0)x : U_1$ and $h(x) : V_1$ is
defined by transfinite recursion on $x : V_0$ so that

$$h(sup(a, f)) = sup(a, (\lambda x : a)h(app(f, x)))$$

for $a : U_0$ and $f : a \to V_0$; i.e. $h(x)$ is the term $rec(b, x)$ where b is the term
$(\lambda x : U_0)(\lambda y : x \to V_0)(\lambda z : x \to V_1)sup(x, z)$.

Notice that the ST interpretation does not use any kind of equality types, neither
intensional nor extensional, so that we have stated the stronger result of a reduction
to MLWU rather than to $MLW^{ext}U$.

We can extend these ideas to more universes, a set theory with n inaccessibles
being given an ST interpretation in a type theory with n + 1 type universes,
$U_0, \ldots, U_n$, with the universe of sets of the set theory being interpreted as the
type $V_n = (Wx : U_n)x$.

Fortunately we do get a matching of a set theory with a type theory of the
same proof theoretic strength when we go to the limit. First consider the type
theory $MLWU_{<\omega}U$ that is obtained from $MLWU_{<\omega}$ by adding the type universe U
at the top reflecting all the rules of $MLWU_{<\omega}$ so that in particular we have the
rules

$$U_n : U \qquad \frac{A : U_n}{A : U}$$

for $n = 0, 1, \ldots$. As above we get an ST interpretation of $CZF^+u_{<\omega}$ into this theory,
using $V = (Wx : U)x$ to interpret the universe of sets of the set theory, giving
us $CZF^+u_{<\omega} <_{ST} MLWU_{<\omega}U$. Now observe that we have a proof theoretic reduction
$MLWU_{<\omega}U < MLWU_{<\omega}$. The idea for this is that any derivation in the left hand
type theory can only involve finitely many of the type universes $U_n$ and so can
be translated into a derivation in the right hand type theory by replacing the
symbol U everywhere by $U_n$, where n is chosen large enough so that $n > i$
whenever $U_i$ occurs in the derivation. Using a previous TS reduction, we get the
next result.
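
The translation step can be sketched on a textual level. The Python fragment below is only an illustration: it assumes derivations are written as plain strings in which the indexed universes appear as `U_0`, `U_1`, ... and the top universe as a bare `U`, and the function name `collapse_top_universe` is hypothetical.

```python
import re

def collapse_top_universe(derivation: str) -> str:
    """Replace the top universe U by U_n, where n exceeds every index i
    of a universe U_i occurring in the derivation."""
    indices = [int(i) for i in re.findall(r"\bU_(\d+)\b", derivation)]
    n = max(indices, default=-1) + 1
    # '\bU\b' does not match inside 'U_3', since '_' is a word character.
    return re.sub(r"\bU\b", f"U_{n}", derivation)

translated = collapse_top_universe("A : U_0, U_2 : U, B : U")
```

Since only finitely many indices occur in any one derivation, a suitable n always exists; here `translated` mentions `U_3` wherever the input mentioned `U`.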



Theorem 17 The following theories are of the same proof theoretic strength:
$CZF^+u_{<\omega}$, $MLWU_{<\omega}U$, $MLWU_{<\omega}$, $MLW^{ext}U_{<\omega}$.

We have the same situation for classical set theory so that, using the fact that
global choice does not increase the proof theoretic strength, we get the next
result.

Theorem 18 The following theories are of the same proof theoretic strength:

$ZFCu_{<\omega}$, $ZFGCu_{<\omega}$, $MLWU_{<\omega}U + EM$, $MLWU_{<\omega} + EM$, $MLW^{ext}U_{<\omega} + EM$.

Finally we observe that the ST interpretation carries over to the set theory
$CZF^+p$ to give the reduction $CZF^+p <_{ST} MLWPU$ and, as above, the reduction
$CZF^+pu_{<\omega} <_{ST} MLWPU_{<\omega}$. This, with a previous reduction, gives us the
following result.

Theorem 19 The following theories are of the same proof theoretic strength:

$CZF^+pu_{<\omega}$, $MLWPU_{<\omega}$, $MLW^{ext}PU_{<\omega}$.

Abstract. In this paper we present a simple model of communication.
We assume that communication takes place between two agents. Each
agent has a private and subjective knowledge state. The knowledge of
both agents is partial, finite, and represented in a computational way. We
investigate how ideas can be transferred from one agent to the other one,
in spite of the subjective nature of the knowledge of both participants.
Posing the problem in this way, it can be seen that mechanisms for
context-dependent interpretation are a prerequisite for successful
communication.



1 Introduction 



Language solves a problem. It helps people to exchange ideas, even if these people
come from different backgrounds, know different concepts and individuals, and
have wildly diverging views on many different matters. Ideas that are privately
known to one agent are transformed into a public message and subsequently
decoded by another agent, who interprets this message and reacts to it. We
model this process, starting from subjective knowledge states, and show how
content which is meaningful in the subjective knowledge state of one agent can
be transferred to the subjective knowledge state of another agent by means of a
common language.

Throughout this paper we concentrate on the simple case where two agents 
communicate about their common (physical) environment. Specifically, we use 
examples in which two agents discuss an electron-microscope, a situation taken 
from the ‘DenK-project’ ([2]). In this project we constructed a man-machine 
interface based on the approach to communication sketched in this paper.

2 Formalising Knowledge States

In this section, we show how an agent’s subjective knowledge state can be for- 
malized by means of type theoretical contexts. First, however, we explain what 
we mean by knowledge. 



2.1 Knowledge 

Each person understands the world in terms of his own concepts. Which concepts
a person has formed at a given moment in time is obviously dependent
on many factors, like his physical and cultural environment, personal history,
etc. Although we are not concerned with the formation of concepts here, we
assume that concepts pertaining to an agent's physical environment are somehow
inspired by his sense impressions: a human interacting with his environment
experiences these impressions as an organised whole, in which various familiar
phenomena interact in more or less predictable ways. These correlations between
sense impressions somehow allow concepts to be formed that are subsequently
used to ‘understand’ the diverse experiences from which they have arisen. Some
of these concepts are ‘inhabitable’, i.e. they may have instances. The person
who is familiar with a specific inhabitable concept will recognise an instance
of this concept whenever he runs into it^1. In this way he is able to connect his
raw experience with the subjective concepts that he uses to classify it. As a
consequence, each agent has its own concepts, often similar to, but not necessarily
identical with, those of his fellow-agents.

An agent’s conscious knowledge^2 about the world will be formulated entirely
in terms of the concepts that he recognises. This knowledge is not static, but 
can grow as a result of communication, observation and inference processes. The 
resulting body of knowledge, the knowledge state, will not be a bare set of facts, 
but a structured conglomerate of justified beliefs, where each new item must 
be embedded in the knowledge which is already present. Thus, this body of 
knowledge is: 

Subjective: It is formulated in terms of personal concepts, it will be partial, 
and it may even be incorrect. 

Incremental: The ways in which this body of knowledge can be extended 
depend on what is already present. 

Justified: Knowledge is not a collection of bare facts, but will be justified in 
terms of other, more basic knowledge. 

^1 How this happens is irrelevant here; an obvious possibility is through neural
networks.

^2 Here we take knowledge in the everyday sense of the word. We quote the definition
in Webster’s new dictionary of synonyms (p. 481): ‘Knowledge applies not only to a
body of facts gathered by study, investigation, or experience but also to a body of
ideas acquired by inference from such facts or accepted on good grounds as truths.’

2.2 Knowledge States as Contexts



How can these ‘subjective’, ‘incremental’, and ‘justified’ knowledge states be for- 
malised? Fortunately, a similar problem arises in mathematics. The main concern 
of mathematicians is to show which consequences follow from certain assump- 
tions. On the one hand this activity is virtually unconstrained: new concepts may 
be developed, and assumptions can be freely made, independent from external 
reality. On the other hand the mathematician has to adhere to a strong kind of 
mental hygiene: concepts can only be formed if they fit into existing categories, 
assumptions can only be made if they are meaningful in the context of that 
which is already given, and all conclusions have to be thoroughly justified. Pure 
Type Systems (PTSs, [3]) are typed-lambda calculi that can be used to record 
such mathematical activity in a formal and machine-readable format. In this 
paper, we assume that a person that tries to understand the outside world is, 
in many respects, comparable to a mathematician. However, the concepts that 
this person develops will be inspired by his sense data, and the assumptions 
that he makes are assumptions about the outside world, and are (one hopes) 
supported by what he sees. In other words, the ‘body of hypotheses’ that this 
person develops will be grounded in the external world. 

A PTS-context $\Gamma$ can represent the ‘body of hypotheses’ of a mathematician.
We propose to use such a context to represent the knowledge state of an agent.
To reflect the assumptions of an agent about the real world, this context has
to be partially grounded in its sense-impressions. This can be achieved if some
inhabitable terms T in the context have an observational interpretation. This
interpretation is the personal ability of the agent to judge whether something
which is perceived is an inhabitant of T or not. A typical PTS may have the
sorts $*_s$ and $*_p$, where the sort $*_p$ corresponds to the type containing all possible
propositions, and the sort $*_s$ to the type containing all possible categories of
objects. For all types T for which the agent has an interpretation, either
$\Gamma \vdash T : *_s$ or $\Gamma \vdash T : *_p$.

Combining sense-impressions and interpretations, the agent takes certain 
types to be inhabited. This is expressed by judgements that contain atomic jus- 
tifications, i.e. justifications which do not admit analysis. These correspond to 
perceived objects, or direct physical evidence for a certain proposition. Though 
an agent is only able to recognize types, he can nevertheless deal with individual 
objects: if the agent knows a certain type T to have exactly one inhabitant^3,
it will interpret all terms in T as denoting the same individual. In cases where 
such a type T has an observational interpretation, the agent is able to recognise 
this individual in the outside world. 



^3 Technically, this might be expressed by an axiom stating that all inhabitants of T
are Leibniz-identical.

2.3 Growth of Knowledge

An agent’s knowledge can grow. In our formalization, this knowledge growth 
corresponds to an extension of the context representing the agent’s knowledge 
state. 

Reasoning is one possible mechanism for knowledge growth. The reasoning of an
agent is modelled by the construction of new statements out of those occurring
in the context representing his knowledge state. Derivability on this context
($\Gamma \vdash E : T$) reflects the agent’s ability to find rational evidence ($E$) for an
assertion ($T$) in his current knowledge state. We assume that the derivation rules
are the same for all agents; knowledge states differ in content, but all agents ‘use
the same logic’.

After deriving a new statement ($E : T$), the knowledge state ($\Gamma$) should be
updated by somehow appending this statement. To allow this kind of recording
of conclusions, De Bruijn ([5]) proposed to enrich the notion of context
with ‘definitions’. Using this idea, the context can be extended with a definition
$x = E : T$^4 whenever a judgement $\Gamma \vdash E : T$ is derived in which $E$ is a complex
expression. This definition expresses that the variable $x$ of type $T$ may be used to
refer to the complex term $E$ in further derivations: $\Gamma, x = E : T \vdash x : T$. In other
words, the definition ‘abbreviates’ the complex term with a fresh variable. At any
point in time a definition can be ‘unfolded’ again, replacing the abbreviation ($x$)
with the complex term $E$. In the presence of definitions, a well-formed context
will look like this: $x_1 : T_1, x_2 : T_2, x_3 = E_3 : T_3, x_4 : T_4, \ldots, x_n : T_n$. It
represents a structured collection of assumptions (atomic justifications),
intermingled with conclusions (complex justifications) that have been drawn on the
basis of these assumptions.

At first sight, recording the results of reasoning in the knowledge state may
seem to be a mere ergonomic device: although the definition saves the trouble
of going through the derivation of $E$ again, the statement $E : T$ can be
reconstructed on any context $\Gamma'$ containing $\Gamma$ ($\Gamma \subseteq \Gamma'$). However, in practice agents
have limited deductive powers, allowing them only to oversee the more or less
obvious consequences of their knowledge: conclusions which can be derived with
a reasonable amount of deductive work. For such agents storing conclusions in the
knowledge state literally broadens their horizon, by bringing consequences into
view that were unreachable before.

Whereas the reasoning process extends an agent’s knowledge state from within,
knowledge can also be obtained from external sources through communication
and observation. This kind of information is represented by a pseudo-context
$y_1 : T_1, \ldots, y_m : T_m$, where $y_1, \ldots, y_m$ are fresh variables. In general, if one
appends a pseudo-context $\Delta$ to a well-formed context $\Gamma$, the result ($\Gamma, \Delta$) is not
a well-formed context. In order to ensure the well-formedness of the result, we
require that $\Delta$ is an extending segment of $\Gamma$:

Definition 1. A pseudo-context $\Delta$ is an extending segment of a well-formed
context $\Gamma$ iff $\Gamma, \Delta$ is a well-formed context.

^4 This is shorthand for: $x = E$ and $E : T$, which together imply that $x : T$.

On the one hand this is just a technical requirement. On the other hand, given 
that types represent concepts in the agent’s knowledge state, it captures the 
intuition that an agent can only extend its knowledge state with information 
that is meaningful to it, i.e. expressed in terms of familiar concepts. 
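
As an illustration of Definition 1, the following Python sketch replaces PTS well-formedness by a much cruder syntactic check: a context counts as well-formed if every variable is fresh and every name used in its type (or definition) has been declared earlier. This stand-in, the tuple representation `(variable, type, definition)` and all names in the example are assumptions made for the sketch; a real implementation would need a type checker.

```python
import re

def identifiers(expr):
    """Names mentioned in an expression; sorts such as '*s' are skipped."""
    return set(re.findall(r"(?<![*\w])[A-Za-z_]\w*", expr))

def is_well_formed(context):
    """Crude stand-in for PTS well-formedness (see the assumptions above)."""
    declared = set()
    for var, typ, definition in context:
        used = identifiers(typ) | identifiers(definition or "")
        if var in declared or not used <= declared:
            return False
        declared.add(var)
    return True

def is_extending_segment(context, segment):
    """Definition 1: the segment extends the context iff both together are well-formed."""
    return is_well_formed(context) and is_well_formed(context + segment)

gamma = [("bundle", "*s", None), ("lens", "*s", None),
         ("enter", "bundle -> lens -> *p", None)]
known = [("b", "bundle", None), ("l", "lens", None), ("e", "enter b l", None)]
unknown = [("c", "chair", None)]  # 'chair' is not a concept of this agent
```

Here `known` is an extending segment of `gamma`, while `unknown` is not, because it is phrased in terms of a concept the agent does not have.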

2.4 Common versus Private Knowledge 

The backbone of a communication process between (two) agents is the continuous
extension of their common knowledge. To model this process adequately, we need
to distinguish within the knowledge state of each agent that part of its knowledge
which it assumes to be shared. So for each agent p we have a context $\Gamma_p$ which
contains all of its knowledge, within which we can distinguish a (sub)context $\Psi_p$,
with $\Psi_p \subseteq \Gamma_p$, which contains all knowledge that, according to p, is shared.

Both $\Gamma_p$ and $\Psi_p$ are well-formed type theoretical contexts in their own right.
The common context $\Psi_p$ is ‘a part of’ the private context $\Gamma_p$, and this relation
can be defined in a straightforward way:

Definition 2. Given two legal contexts $\Gamma$ and $\Gamma'$, $\Gamma$ is a part of $\Gamma'$, notation
$\Gamma \subseteq \Gamma'$, iff

1. for all statements of the form $x : T$ occurring in $\Gamma$ either
$x : T$ or a definition $x = E : T$ occurs in $\Gamma'$,

2. all statements of the form $x = E : T$ occurring in $\Gamma$ occur also in $\Gamma'$.

Under this ‘part of’-relation, every definition in $\Gamma$ must occur in $\Gamma'$ (2), but
declarations in $\Gamma$ may be replaced by definitions in $\Gamma'$ (1). This will be of use in
Sect. 5, where it allows us to ‘link’ shared information in $\Psi_p$ to private
information in $\Gamma_p$.
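
Definition 2 translates almost directly into code. In the Python sketch below a statement is a tuple `(variable, type, definition)`, with `None` as definition for a plain declaration; this representation and the example entries (in particular the term `the_bundle`) are hypothetical.

```python
def is_part_of(gamma, gamma_prime):
    """Definition 2: gamma is a part of gamma_prime."""
    # (x, T) for every declaration x : T and every definition x = E : T
    typed = {(var, typ) for var, typ, _ in gamma_prime}
    definitions = {(var, typ, d) for var, typ, d in gamma_prime if d is not None}
    for var, typ, d in gamma:
        if d is None and (var, typ) not in typed:
            return False   # clause 1 fails
        if d is not None and (var, typ, d) not in definitions:
            return False   # clause 2 fails
    return True

common = [("x1", "*s", None), ("x5", "x1", None)]
private = [("x1", "*s", None), ("x5", "x1", "the_bundle"), ("x7", "x1", None)]
```

With these contexts `is_part_of(common, private)` holds, since the declaration of `x5` in the common context is matched by a definition in the private one, whereas `is_part_of(private, common)` does not.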

The distinction between common and private knowledge gives rise to a somewhat
more fine-grained account of reasoning. Since both $\Gamma_p$ and $\Psi_p$ are legal
contexts, new statements can be constructed on either context using the derivation
rules. Hence we can model the agent reaching ‘private conclusions’ in reasoning
with private information ($\Gamma_p$) and ‘common conclusions’ in reasoning with
common information ($\Psi_p$). Information that is shared with another agent is also
privately available, as reflected in the inclusion $\Psi_p \subseteq \Gamma_p$. This inclusion
guarantees that any statement derivable on an agent’s common context ($\Psi_p$) is also
derivable on his private context ($\Gamma_p$), but not the other way around.

3 Communicable Content 

The previous section shows how the knowledge states of communicating agents
are modelled by means of type theoretical contexts. In this section, we extend
the model with a formal account of communicable content: first the relation
between the subjective knowledge states and the common language in which the
agents communicate is discussed, then we characterize communicable content.

3.1 Concepts and the Common Language

On the one hand, each agent has its own knowledge state, built on concepts
which are meaningful only to itself. On the other hand, the agents speak
a common language in which they communicate. Hence each agent somehow
connects its subjective concepts to words in this language. For instance, each of
the agents will recognise a certain class of objects that are used by people to sit
on. There is a word to describe this class in the language; in English objects in
this class are called ‘chairs’. There may be certain differences in interpretation,
i.e. one agent may recognise an object as a chair that the other would call
otherwise, but on the whole the two categories will match quite well. This is the
case because the use of all words is constantly being gauged by the language
community.

If a word in a given language corresponds to a concept, this concept is
necessarily common to a rather large group, i.e. the speakers of the language in
question. The specific individual objects that we encounter in our daily life, such
as ‘my chair’, are not commonly known among all speakers. Accordingly there
exist no words that directly refer to these objects. This means we have to refer to
these objects as instances of a certain class, and try to point out the particular
object through a description of characteristic properties that are accessible to
the dialogue partner.

For the purposes of this paper it is not necessary to elaborate the mapping
between language and Type Theory. We simply assume that for each agent (p)
there exists a partial mapping $\Gamma_p \mapsto W$ between type variables in its knowledge
state and words in the vocabulary of the shared language. How this mapping
was formed (when the language was learned) is also outside the scope of our
model. Though the mapping between the knowledge states and the language in
our model is rather crude, it still reflects the fact that words must necessarily
refer to general concepts, which are meaningful to the language community as
a whole: the mapping does not extend to the level of particular individuals and
proofs, i.e. the inhabitants of inhabitants of $*_p$ or $*_s$.

3.2 Messages 

Against the background of our unsophisticated account of the relation between 
the knowledge states and the language spoken by the agents, we wish to un- 
derstand how information can be exchanged between private type theoretical 
knowledge states, using expressions in some public language. These expressions, 
which we call ‘messages’, will somehow have to be meaningfully related to the 
knowledge states of both agents; they express content that an agent can commu- 
nicate to his dialogue partner. 

We assume that there are two agents, A and B, that both have a subjective
knowledge state. If communication is to be possible, they must share a common
vocabulary W. To communicate, the speaker (A) must encode a segment $\Delta_A$
which is meaningful within its own knowledge state into a public message, using
this vocabulary. This message is sent to the hearer (B), which subsequently
decodes it. If the communication is to be successful, the result of decoding must
be meaningful to B, i.e. it must be an extending segment of B’s context.

How can segments be encoded and decoded? Obviously, both encoding and
decoding have to be based on the common vocabulary. The agent A uses the
mapping $\Gamma_A \mapsto W$ to encode a segment $\Delta_A$. Given the mapping, it simply
substitutes in the segment $\Delta_A$ the words of the vocabulary for the types they
are related to. However, not every meaningful segment $\Delta_A$ can be encoded
successfully. Obviously, we must ensure that the result of encoding, which is to be a
public message, does not contain any privately bound variables. Segments that
meet this requirement we call codeable.

All this is expressed formally in the following definitions: 

Definition 3. A variable z occurs free in a segment $\Delta$, $\Delta \equiv x_1 : T_1, \ldots, x_n : T_n$,
iff z occurs free in $T_i$ ($1 \le i \le n$) and there is no statement $x_j : T_j$ with $1 \le j < i$
such that $z = x_j$.

Definition 4. A segment $\Delta_A$ is codeable if for all variables z occurring free
in $\Delta_A$ there is a word w in W such that $z \mapsto w$.

Definition 5. A message $\mu$ is the result of coding a codeable segment $\Delta_A$, i.e.
the result of replacing all variables that occur free in $\Delta_A$ by the corresponding
words from the mapping $\Gamma_A \mapsto W$.

Thus, coding a codeable segment yields a public message. Upon receiving such
a message, the recipient (B) can try to decode it. Basically, decoding is the
inverse of encoding, using the recipient's mapping, $\Gamma_B \mapsto W$, in the direction from
words to types.^5 Note that, if a non-codeable segment were encoded, subsequent
decoding would yield a pseudo-context which contains unbound variables and
hence cannot be an extending segment of any context.



3.3 Example 

Take a simple situation where the agents A and B assume that they share all
concepts related to their common vocabulary. This means that in the knowledge
state of each agent p, all types related to a word in the vocabulary (by $\Gamma_p \mapsto W$)
are declared in their common context ($\Psi_p$). We assume that the common
vocabulary consists of English words^6, hence messages appear in a sort of ‘toy
English’: as segments where words have been substituted for some of their
variables, e.g. b : bundle, p : primary(b). In the table below, which only lists the
shared knowledge of A and B, we see that agents A and B have a shared
vocabulary that contains at least the words ‘bundle’, ‘lens’, ‘primary’ and ‘enter’.
The type variables corresponding to these words are declared in their common
contexts $\Psi_A$ and $\Psi_B$. Note that A and B have different type variables that
correspond to the same word: in A’s knowledge state the word ‘lens’ is mapped
to the type $x_3$, in B's knowledge state it is mapped to the type $y_5$.

^5 Here we assume that the mapping is one to one, and that this is a simple matter.
In a more realistic setting, using natural language, such an assumption is no longer
justified as words can be ambiguous. But, even then, the requirement that the result
of decoding must be an extending segment of the receiver’s context can often be
used to disambiguate the message successfully, see [6].

^6 To construct a more realistic mapping, not only the vocabulary but the whole
language should be taken into account. For a mapping from type theory to English, see
([7]).



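$\Psi_A$                          $\Gamma_A \mapsto W$         $\Psi_B$                          $\Gamma_B \mapsto W$

$x_1 : *_s$,                      $x_1 \mapsto$ bundle         $\ldots$,
$x_2 : x_1 \to *_p$,              $x_2 \mapsto$ primary        $y_5 : *_s$,                      $y_5 \mapsto$ lens
$x_3 : *_s$,                      $x_3 \mapsto$ lens           $y_6 : y_5$,
$x_4 : x_1 \to x_3 \to *_p$,      $x_4 \mapsto$ enter          $\ldots$,
$x_5 : x_1$,                                                   $y_{17} : *_s$,                   $y_{17} \mapsto$ bundle
$x_6 : x_3$,                                                   $y_{18} : y_{17}$,
                                                               $y_{19} : y_{17} \to y_5 \to *_p$,  $y_{19} \mapsto$ enter
                                                               $\ldots$,
                                                               $y_{32} : y_{17} \to *_p$         $y_{32} \mapsto$ primary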



In this setting A can, for instance, encode the segment $u : x_1, v : x_3, z : x_4(uv)$
(with $z, u, v$ $\Psi_A$-fresh). This segment, meaning ‘there is a bundle and
there is a lens, and the bundle enters the lens’ to A, encodes into the message:
u : bundle, v : lens, z : enter(uv). If the agent B decodes this, it ends up with the
segment $u : y_{17}, v : y_5, z : y_{19}(uv)$, which is meaningful to it. The segment can be
shown to extend B's common context as $\Psi_B \vdash y_{17} : *_s$, $\Psi_B, u : y_{17} \vdash y_5 : *_s$,
and $\Psi_B, u : y_{17}, v : y_5 \vdash y_{19}(uv) : *_p$.
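
Definitions 3 to 5 and the decoding step can be replayed on this example. The Python sketch below is an illustration under simplifying assumptions: a segment is a list of `(variable, type)` pairs, type expressions are plain strings whose alphanumeric tokens are variables, and the two dictionaries transcribe the vocabulary mappings of A and B from the table. The function names are hypothetical.

```python
import re

MAP_A = {"x1": "bundle", "x2": "primary", "x3": "lens", "x4": "enter"}
MAP_B = {"y5": "lens", "y17": "bundle", "y19": "enter", "y32": "primary"}

def free_variables(segment):
    """Definition 3: variables used in some T_i and not bound by an earlier x_j."""
    bound, free = set(), set()
    for var, typ in segment:
        free |= set(re.findall(r"\w+", typ)) - bound
        bound.add(var)
    return free

def is_codeable(segment, mapping):
    """Definition 4: every free variable is related to a word."""
    return free_variables(segment) <= mapping.keys()

def substitute(segment, table):
    """Replace free tokens by their image in table; bound variables stay."""
    bound, result = set(), []
    for var, typ in segment:
        def swap(m, bound=frozenset(bound)):
            tok = m.group(0)
            return tok if tok in bound else table.get(tok, tok)
        result.append((var, re.sub(r"\w+", swap, typ)))
        bound.add(var)
    return result

def encode(segment, mapping):
    """Definition 5: code a codeable segment into a message."""
    if not is_codeable(segment, mapping):
        raise ValueError("segment is not codeable")
    return substitute(segment, mapping)

def decode(message, mapping):
    """Inverse direction, from words to the recipient's types."""
    return substitute(message, {word: typ for typ, word in mapping.items()})

segment_a = [("u", "x1"), ("v", "x3"), ("z", "x4(u v)")]
message = encode(segment_a, MAP_A)
segment_b = decode(message, MAP_B)
```

A segment such as `[("w", "x5")]` is rejected by `is_codeable`, because the particular bundle $x_5$ has no word in the vocabulary.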

4 Polarity and Information Flow 

Depending on the situation, A and B may share more than just their vocabulary, 
even at the beginning of a dialogue. There may be certain general knowledge 
which they can correctly assume to share with their partner, or they may share 
certain information as a result of a previous conversation. This information will 
then also be represented in their common contexts. In our example, this is in 
fact the case: apart from the types related to the vocabulary, each agent has a 
representation for a particular bundle ($x_5$ and $y_{18}$ respectively) and a particular
lens ($x_6$ and $y_6$ respectively). In their messages, both agents need to be able
to refer to these individuals. To do so, they must make descriptions of these 
objects that can be understood by their dialogue partners. Thus we also need a 
mechanism that provides this possibility. 

So far we have described only one way of dealing with extending segments 
(Sect. 2.2): the agent simply appends the extending segment to his knowledge 
state. This is passive in the sense that the agent makes no effort whatsoever to 
connect the new information represented by the extending segment to the infor- 
mation already present in his knowledge state. The new information is simply 
stored as a set of additional ‘hypotheses’ or ‘assumptions’ (all justifications in
the segment are atomic). However, the receiving agent can also digest a decoded
segment in a different, more active way by trying to find justifications (objects 
and proofs) in his own current contexts to replace the dummy inhabitants of 
the statements in the extending segment. In doing so, we say that the agent 
constructs a ‘realization’ for the extending segment in his original context. 



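Definition 6. Let $\Delta \equiv x_1 : T_1, \ldots, x_n : T_n$ be an extending segment of $\Gamma$, and let

$\Gamma \vdash D_1 : T_1$ and
$\Gamma \vdash D_2 : T_2[x_1 := D_1]$ and
$\Gamma \vdash D_3 : T_3[x_1 := D_1, x_2 := D_2]$ and
... and
$\Gamma \vdash D_n : T_n[x_1 := D_1, \ldots, x_{n-1} := D_{n-1}]$,

then we call $\Delta^* \equiv x_1 = D_1 : T_1, \ldots, x_n = D_n : T_n$ a realization of $\Delta$ in $\Gamma$
under the substitution $[x_1 := D_1, \ldots, x_n := D_n]$.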





Processing a segment actively, the agent appends the realization $\Delta^*$ to its context 
instead of the extending segment $\Delta$. The point is that segments, when used in 
this way, act as selective ‘hooks’ with which the rest of the message is connected 
to particular inhabitants in the knowledge state of the hearer. In fact, an actively 
processed segment does not provide the hearer with new information. Formally, 
this fact is reflected by the following proposition^7, which shows that realizations 
can be eliminated: 



Proposition 1. Assume $\Gamma, \Delta \vdash B : C$. Let $\Delta^*$ be a realization of $\Delta$ in $\Gamma$ under 
the substitution $[S]$; then $\Gamma, \Delta^* \vdash B : C$ and $\Gamma \vdash B[S] : C[S]$. 

Thus, an extending segment $\Delta$ of a context $\Gamma$ can be used in two quite different 
ways: either as a hypothesis extending the current knowledge state, or as a requirement 
for which a realization is to be constructed in the current knowledge state. 
We call the former use of segments ‘positive’, the latter ‘negative’. These two polarities 
determine the direction of the flow of information in our communication 
model. 



5 Communication 

The previous sections have shown how content that is privately meaningful to 
agent A can be encoded in a public message which can subsequently be decoded 
by agent B. If this process is successful, the result of this decoding is meaningful 
to B: a segment extending its knowledge state. As we have seen, B can process 
this segment in different ways (‘passive’ or ‘active’), and B’s knowledge state has 
two parts (common context, and private context). In communication, the various 
possibilities the receiving agent has for processing a message can be used by the 
agent sending the message to achieve its communicative goals. By labelling (parts 
of) the message with tags stating where and how it should be processed, the 
sending agent can control the way the message is received by the other agent. As 
a consequence, the expressions exchanged between two communicating agents in 
our model are labeled messages rather than just messages. Using more traditional 
terminology, we could say that a labeled message is the unit corresponding to an 
utterance, where the message carries the content of the utterance and the labels 
its pragmatic force. 

^7 This is simply an iterated version of the ‘Substitution Lemma’ for PTSs, see ([3]). 

The following definition introduces notation for the two pairs of epistemically 
motivated labels we have encountered so far, and describes their use: 

Definition 7. If a (part of a) message $\mu = z_i : w_i, \ldots, z_j : w_j$ (where $w_i, \ldots, w_j$ 
are words from $W$, possibly followed by a number of arguments) is labelled 

- positive, notation: $z_i : w_i, \ldots, z_j : w_j\ [z_i, \ldots, z_j]^+$, the receiving agent has to 
append the segment resulting from decoding $\mu$ to one of the contexts representing 
its knowledge state. 

- negative, notation: $z_i : w_i, \ldots, z_j : w_j\ [z_i, \ldots, z_j]^-$, the receiving agent has to 
construct a realization for the segment resulting from decoding $\mu$ on one of 
the contexts representing its knowledge state. 

- common, notation: $z_i : w_i, \ldots, z_j : w_j\ [z_i, \ldots, z_j]_\Psi$, the location where the 
receiving agent has to process the segment resulting from decoding $\mu$ is its 
common context. 

- private, notation: $z_i : w_i, \ldots, z_j : w_j\ [z_i, \ldots, z_j]_\Gamma$, the location where the 
receiving agent has to process the segment resulting from decoding $\mu$ is its 
private context. 

As we will see, tags can apply to different variables within one message, specifying 
for each part of this message on which location and with what polarity it is to 
be processed by the receiving agent. 
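
The two polarities and two locations can be illustrated with a small executable sketch. It is not part of the model as presented here: the names `Decl`, `Agent`, `realize` and `process` are hypothetical, types are simplified to a head symbol applied to variable names, and the decoding step is skipped by using the public words directly as types.

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass(frozen=True)
class Decl:
    var: str                     # declared variable, e.g. "b"
    typ: tuple                   # simplified type: head symbol + argument variables
    defn: Optional[str] = None   # definiens; None for an atomic assumption

@dataclass
class Agent:
    common: list = field(default_factory=list)    # common context
    private: list = field(default_factory=list)   # strictly private part

def realize(segment, context, subst=None):
    """Search (with backtracking) for inhabitants in `context` that realize `segment`."""
    subst = subst or {}
    if not segment:
        return []
    first, rest = segment[0], segment[1:]
    # apply the substitution built so far to the arguments of the type
    target = (first.typ[0],) + tuple(subst.get(a, a) for a in first.typ[1:])
    for d in context:
        if d.typ == target:
            tail = realize(rest, context, {**subst, first.var: d.var})
            if tail is not None:
                return [Decl(first.var, target, d.var)] + tail
    return None

def process(agent, segment, polarity, location):
    """Process a decoded segment: '+' appends it, '-' appends a realization of it."""
    ctx = agent.common if location == "common" else agent.private
    if polarity == "+":
        ctx.extend(segment)
        return True
    # the common context is a subcontext of the private one
    visible = agent.common if location == "common" else agent.common + agent.private
    realization = realize(segment, visible)
    if realization is None:
        return False
    ctx.extend(realization)
    return True

# Agent B knows two bundles; only y34 is known to be primary.
B = Agent(common=[Decl("y1", ("bundle",)), Decl("y34", ("bundle",)),
                  Decl("N", ("primary", "y34"))])
process(B, [Decl("b", ("bundle",)), Decl("p", ("primary", "b"))], "-", "common")
process(B, [Decl("l", ("lens",)), Decl("q", ("enter", "b", "l"))], "+", "common")
```

In this sketch the negative part adds only definitions (`b = y34`, `p = N`) to the context, whereas the positive part adds genuinely new assumptions; a failed search returns `False`, which corresponds to a description the hearer cannot resolve.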

In the next subsections we show how the ingredients presented so far can be 
used by the agents in our model to perform two basic acts of communication: 
providing information, and obtaining information. 

5.1 Providing Information 

Suppose that agent A wants to provide information to agent B. In fact, A wants 
to tell B that the commonly known bundle, to which A itself refers by an identifier 
of its own, enters a lens, which at present A does not assume to be shared. In English, 
A might express this by the sentence ‘The primary bundle enters a lens’^8. Using labeled 
messages, A can express this information as follows: 

$b : bundle,\ p : primary(b),\ l : lens,\ q : enter(b, l)\ \ [b, p]^-_\Psi\ [l, q]^+_\Psi$ (1) 

This message ($\mu$) is an encoding of the segment $b : x_1,\ p : x_2(b),\ l : x_3,\ q : x_4(b, l)$. 
The tags show that it has a positive and a negative part. The first, negative, 
part ($\mu_1$) corresponds to the description of the primary bundle: 

$b : bundle,\ p : primary(b)\ \ [b, p]^-_\Psi$ (2) 



^8 Note that in this sentence the commonly known bundle is referred to by a definite, 
and the privately known lens by an indefinite. 

It instructs B to find a realization for $b$ and $p$ on its common context 
for the segment ($\Delta_{B1}$) resulting from decoding $\mu_1$. Assuming that A’s use of 
the description was appropriate, B will find an object, say $y_{34}$, representing 
the bundle in its common context along with a proof that it is the primary 
bundle, say $N$. Agent B extends its common context with this realization, 
$b = y_{34} : y_{17},\ p = N : y_{32}(b)$, and proceeds by processing the second part of 
the message. This part ($\mu_2$) contains the proposition asserted by A that for 
some lens $l$ the primary bundle enters $l$: 

$l : lens,\ q : enter(b, l)\ \ [l, q]^+_\Psi$ (3) 

Note that the $b$ that occurs free in $\mu_2$ is now bound in the extended common 
context by the definition $b = y_{34} : y_{17}$. The second part of the message is 
tagged positively; B is supposed to ‘absorb’ the information in the segment 
($\Delta_{B2} \equiv l : y_5,\ q : y_{19}(b, l)$) resulting from decoding $\mu_2$, i.e. add the statements 
in $\Delta_{B2}$ to its common context. The first statement in $\Delta_{B2}$ introduces a new 
lens ($l$) into B's common context. The second statement introduces a new ‘piece 
of evidence’ into the common context, a proof object ($q$) for the proposition 
that $b$ enters $l$. 

The processing of the entire tagged message therefore updates the common 
context of B in two steps: $\Psi_B \Rightarrow \Psi'_B$ with $\Psi'_B = \Psi_B, \Delta^*_{B1}$, where $\Delta^*_{B1}$ is a 
realization of $\Delta_{B1}$ in $\Psi_B$ under substitution $[S]$, followed by $\Psi'_B \Rightarrow \Psi''_B$ with 
$\Psi''_B = \Psi'_B, \Delta_{B2}$. According to Proposition 1, $\Delta^*_{B1}$ can be eliminated in favour of $[S]$, 
yielding $\Psi''_B = \Psi_B, \Delta_{B2}[S]$ (where $\Delta_{B2}[S]$ abbreviates the application of $[S]$ to the 
statements in $\Delta_{B2}$). From this point of view the net effect of the entire message 
on the common context of B is an update with evidence for the proposition that 
for some lens ($l$) the primary bundle ($y_{34}$) enters that lens. 

It should be noted that the reaction of agent B in this example is the simplest 
or ‘most cooperative’ one possible; it adds the information provided by A without 
questioning it in any way. Depending on factors in the dialogue situation not 
considered here, this reaction could be more ‘cautious’^9. 

The successful sending of a labeled message by A not only affects the knowledge 
state of B, but also that of A itself. The message was sent in public, and hence 
affects the common context of A in the same way as the common context of B. 
As we described above for B, this results in an extension of the common context: 
$\Psi_A \Rightarrow \Psi''_A$ with $\Psi''_A = \Psi_A, \Delta^*_{A1}, \Delta_{A2}$, or equivalently $\Psi''_A = \Psi_A, \Delta_{A2}[S]$ (where $[S]$ 
substitutes A’s representation of the primary bundle for $b$). 

Privately, agent A will know more: we assumed it had some justification for 
its message. In particular, A must have ‘evidence’ for the proposition expressed 
by the positively tagged part of the message, $\mu_2$. The least we can assume about 
this evidence is that there exists a realization of $\Delta_{A2}$ on its private context $\Gamma_A$, 
e.g.: $\Delta^*_{A2} \equiv l = x_{35} : x_3,\ q = M : x_4(b, l)$. This realization shows in which 
respects A can know more in its private context than in its common context: 
in $\Gamma_A$ it knows which lens the bundle enters ($x_{35} : x_3$), rather than just ‘a 
lens’ ($l$) in $\Psi_A$. Moreover, in $\Gamma_A$ it has a structured proof ($M$) for this rather 
than a ‘dummy’ ($q$) in $\Psi_A$. Agent A can connect its ‘private’ justifications to 
its ‘common’ justifications by updating its private context with the realization: 
$\Gamma_A \Rightarrow \Gamma'_A$ with $\Gamma'_A = \Gamma_A, \Delta^*_{A2}$. By adding these definitions, $x_{35}$ is linked to $l$ 
and $M$ to $q$ ($b$ was already linked to A’s representation of the primary bundle by 
the update of $\Psi_A$ with $\Delta^*_{A1}$). Without this link, A would be unable to combine 
information about $l$ and $x_{35}$ in its private context. 

^9 For instance, the DenK-system will not accept all assertions made by the user, 
because it is an expert on the domain whereas the user is a novice. 



5.2 Obtaining Information 

Alternatively, one might suppose that A wants to obtain information from B. 
If A did not know which lens the primary bundle enters, it might ask in English 
‘Which lens does the primary bundle enter?’. To do this, A can use the same 
message as in the previous case but tagged differently: 

$b : bundle,\ p : primary(b),\ l : lens,\ q : enter(b, l)\ \ [b, p]^-_\Psi\ [l, q]^-_\Gamma$ (4) 

The first part of the message ($\mu_1$) is tagged as before: 

$b : bundle,\ p : primary(b)\ \ [b, p]^-_\Psi$ (5) 

Hence it will be processed by B in the previously described way, yielding an 
update of $\Psi_B$ with a realization $\Delta^*_{B1}$ for the primary bundle. The tags for the 
second part of the message ($\mu_2$) differ from those in the previous example in two 
ways: 

$l : lens,\ q : enter(b, l)\ \ [l, q]^-_\Gamma$ (6) 

Firstly, $\mu_2$ now has a negative polarity, instructing agent B to view it as a requirement. 
Secondly, $\mu_2$ has to be processed on the private context. In other 
words, B is required to construct a realization on $\Gamma_B$ for the segment ($\Delta_{B2}$) 
resulting from decoding $\mu_2$; it has to find a lens and construct a proof that the 
primary bundle enters this lens^10. We assume that B is able to construct such a 
realization, say $l = y_{79} : y_5,\ q = M : y_{19}(b, l)$. At least one of the items in this realization 
must be ‘strictly private’ in the sense that it is available on $\Gamma_B$ but not 
on the subcontext $\Psi_B$, for the reason mentioned above: if the entire realization 
could be constructed on $\Psi_B$, a realization could also be constructed on $\Psi_A$ and 
then A’s request for information would be superfluous. In this particular case, 
the lens ($y_{79}$) could be in $\Psi_B$ but the proof object cannot be derived on $\Psi_B$. The 
update of its private context with the realization, $\Gamma_B \Rightarrow \Gamma'_B$ with $\Gamma'_B = \Gamma_B, \Delta^*_{B2}$, 
brings B in a position where it (privately) possesses all information needed to 
answer A’s question. 

^10 Type theoretically this requirement is well-formed: $\Delta_{B2}$ is an extending segment of 
$\Gamma_B$; the variable $b$ occurring free in $\Delta_{B2}$ is bound in $\Gamma_B$ after the update of $\Psi_B$ with 
$\Delta^*_{B1}$, because of the inclusion $\Psi_B \subseteq \Gamma_B$. 






Since the identity of objects cannot be communicated directly, B will have to 
describe the lens $y_{79}$ to A using common resources. For instance, if it is commonly 
known among A and B that the microscope contains a number of condensor lenses 
which are arranged in some order, B could send a labeled message to A expressing 
that the primary bundle enters the first condensor lens, thereby describing the lens 
to A. This labeled message, which again provides information, will update the 
common contexts of both A and B, in the way described in Sect. 5.1, with the 
information A wanted to obtain. 

6 Conclusions 

We have presented a simple model of communication for cases where two par- 
ticipants exchange information about a shared environment. The model is based 
on an explicit type-theoretical formalization of the knowledge states of the com- 
municating agents, which stresses the subjective nature of these states. This 
formalization of knowledge states by type theoretical contexts has an inherent 
notion of meaningfulness: not only do these contexts show which propositions 
and categories of objects an agent takes to be inhabited at a given time, but 
also which types are well-formed for this agent given its knowledge state, i.e. what 
information is meaningful to it. 

In the model, we show how an agent can communicate content which is 
meaningful to itself to another agent by means of a shared (public) language, 
despite the subjective nature of its knowledge state. The fact that the agents 
share a language implies that each agent has a personal mapping between some 
of the types in its knowledge state and the constructs in the shared language. 
We show how an agent can encode content which is meaningful in its knowledge 
state in a public message, and how the agent receiving this message can decode 
it (using its own mapping) into information which is meaningful in its own 
knowledge state. As the examples show, information can be exchanged between 
agents and subsequently shared. 

Our approach to communication is not centered around the notion of truth, 
but tries to show how personal information becomes shared. Accordingly, it 
emphasizes information flow. This is reflected in the notion of polarity, which 
specifies the direction of the flow, and also in the importance that we attach 
to the various locations where the information can reside. It seems that such 
emphasis helps to get a computational understanding of various phenomena 
in dialogue. In fact, the direction of information flow underlies the distinction 
between questions and assertions, as well as that between definite and indefinite 
descriptions. 

Although the model is hardly elaborated here, we do show how utterances 
can be interpreted within the knowledge of the receiver, and how communication 
really leads to progress through an extension of that which is commonly known. 
In our model all information is distributed over the participants, even if a part of 
it is assumed to be shared. This realistic feature of the approach brings out the 
difficulties involved in referring, even when referring to commonly known objects. 
In fact, we do not see how such reference would be possible without a context-dependent 
interpretation mechanism involving the construction of realizations, 
similar to the one sketched here. Interestingly, type theory offers the possibility 
to use this same mechanism to refer to reasons, i.e. justifications, as well. This 
provides a direct handle on the argumentative structure of the dialogue. 

A different matter is how all this works out in practice: actual agents in actual 
dialogues. In the DenK-project we constructed a man-machine interface based on 
the approach to communication sketched in this paper. The interface contains 
an artificial agent that reasons in type theory and interprets the utterances of 
the user in a context-sensitive way ([2]). This shows that at least for a given domain 
and a small fragment of English our approach is feasible ([6]). 

We feel that our model could provide a point of departure for a computational 
theory of dialogue. We are strengthened in this conviction by the mutually compatible 
theories of various linguistic phenomena that have already been formulated 
in this framework, such as: presuppositions ([8]), the resolution of definite 
descriptions (including anaphora and uses of deixis, ([4])) and question/answer 
relations ([8]). 

Abstract. We describe how the theory of Gröbner bases, an important 
part of computational algebra, can be developed within Martin-Löf’s 
type theory. In particular, we aim for an integrated development 
of the algorithms for computing Gröbner bases: we want to prove, constructively 
in type theory, the existence of Gröbner bases and from such 
proofs extract the algorithms. Our main contribution is a reformulation 
of the standard theory of Gröbner bases which uses generalised inductive 
definitions. We isolate the main non-constructive part, a minimal bad 
sequence argument, and use the open induction principle [Rao88,Coq92] 
to interpret it by induction. This leads to short constructive proofs of 
Dickson’s lemma and Hilbert’s basis theorem, which are used to give an 
integrated development of Buchberger’s algorithm. An important point 
of this work is that the elegance and brevity of the original proofs are 
maintained while the new proofs also have a direct constructive content. 
In the appendix we present a computer formalisation of Dickson’s lemma 
and an abstract existence proof of Gröbner bases. 



1 Introduction 

This work is part of a project to develop computational algebra completely 
within Martin-Lof’s type theory [NPS90], in an integrated fashion. Since the 
birth of the subject, algorithms in computational algebra have usually been ex- 
ternally developed: an algorithm is given and its correctness and termination is 
proved using classical logic. A possible reason for this approach is that classical 
abstract algebra is inherently non-constructive; e.g. existence-proof of primi- 
tive objects like prime and maximal ideals require Zorn’s lemma, a highly non- 
constructive principle. This makes the other approach difficult, the integrated de- 
velopment, where an algorithm is extracted from a constructive existence proof. 
The notion of integrated and external programming logics was introduced by 
Girard [Gir86], for a comparison between the integrated and external approach 
to program development, see [Dyb90]. 

Gröbner bases, together with an algorithm for computing them, were introduced 
by Buchberger [Buc65,Buc85]. The algorithm can be seen as a generalisation, to 
polynomials in several variables, of the Euclidean algorithm for computing the 
greatest common divisor (gcd). In the case of polynomials in one variable, one can 
easily decide whether a polynomial $f$ is in the ideal generated by the set of polynomials 
$F = \{f_1, \ldots, f_n\}$: just compute the gcd of $F$; since it generates the ideal 
of $F$, it is enough to check whether it divides $f$. This gives many algorithmic 
solutions to problems concerning polynomials in one variable. 
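
The one-variable membership test can be made concrete with a short sketch over the rationals. It is only an illustration: the function names and the representation of polynomials as coefficient lists (lowest degree first) are assumptions of this example.

```python
from fractions import Fraction

def trim(p):
    """Drop trailing zero coefficients (coefficients are listed lowest degree first)."""
    p = list(p)
    while p and p[-1] == 0:
        p.pop()
    return p

def poly_mod(a, b):
    """Remainder of a on division by a non-zero polynomial b, over the rationals."""
    a, b = [Fraction(c) for c in trim(a)], trim(b)
    while len(a) >= len(b):
        q = a[-1] / b[-1]              # quotient of the leading coefficients
        shift = len(a) - len(b)
        for i, c in enumerate(b):
            a[shift + i] -= q * c      # cancels the leading term of a
        a = trim(a)
    return a

def poly_gcd(a, b):
    """Euclidean algorithm; the result is a gcd up to a non-zero scalar."""
    a, b = trim(a), trim(b)
    while b:
        a, b = b, poly_mod(a, b)
    return a

def in_ideal(f, generators):
    """Decide whether f lies in the ideal generated by `generators`."""
    g = []
    for p in generators:
        g = poly_gcd(g, p)
    if not g:                          # the zero ideal contains only 0
        return not trim(f)
    return not poly_mod(f, g)

# F = {x^2 - 1, x^2 - 3x + 2}; its gcd is x - 1 up to a scalar.
F = [[-1, 0, 1], [2, -3, 1]]
member = in_ideal([0, -1, 0, 1], F)    # x^3 - x
non_member = in_ideal([1, 1], F)       # x + 1
```

Here `x^3 - x` is divisible by `x - 1` and hence in the ideal, whereas `x + 1` is not.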

However, in the case of polynomials in several variables, this technique does 
not work. One problem is that $F$ may not have a single generator of its ideal; 
hence one needs to define when a polynomial is divided by a set of polynomials. 
Another more important problem is that this technique is not complete: if $F$ 
divides $f$ then $f$ is in the ideal of $F$, but $f$ might be in the ideal of $F$ even 
though $F$ does not divide $f$. This is where Gröbner bases come into play; a 
Gröbner basis $G$ of $F$ is a finite set of polynomials which generates the same 
ideal and divides $f$ if and only if $f$ is in the ideal of $F$. 

There are several thorough presentations of Gröbner bases and their applications 
[Buc85,Buc98,BW93,CLO97,Frö97]. However, these proofs are in general 
not constructive, which makes their development hard to translate into type theory. 
Our main contribution is a reformulation of the standard theory of Gröbner 
bases which replaces the non-constructive arguments by the use of generalised 
inductive definitions [Acz77]. A natural question is then whether the original 
elegant arguments are lost forever in this constructive framework. We show that 
this is not so by isolating the main non-constructive principle, a minimal bad 
sequence argument, and using the open induction principle [Rao88,Coq92] to interpret 
it by induction. This leads to short constructive proofs which follow the 
original arguments, and which have been formalised on computer. 

In Section 2, we present a constructive existence proof of Gröbner bases 
for polynomial rings with field coefficients. This proof uses a short constructive 
proof of Dickson’s lemma [Dic13] which was extracted from a classical proof using 
open induction. During this work, we became aware of the work in [The98], 
where Théry presents a formal verification in Coq [HKPM97] of Buchberger’s 
algorithm for computing Gröbner bases for polynomial rings with field coefficients. 
This formal proof is constructive, except for a classical proof of Dickson’s 
lemma [Pot96]. A difference is that Théry’s development is external; he starts 
with a program for Buchberger’s algorithm and proves that it computes Gröbner 
bases, whereas we prove the existence of Gröbner bases constructively in such a 
way that Buchberger’s algorithm is contained in the proof. 

In Section 3, we present a short constructive proof of Hilbert’s basis theorem, 
also extracted from a classical proof using open induction. This theorem can 
be used to prove termination of generalisations of Buchberger’s algorithm for 
computing Gröbner bases for polynomials over principal ideal domains [BW93] 
and other algebraic structures [JL91]. Another approach was taken by Jacobsson 
and Löfwall [JL91], who proved Hilbert’s basis theorem constructively by using 
Gröbner bases and a different definition of Noetherian. 

This work is similar in spirit to Berger and Schwichtenberg [BS96], where an 
algorithm for computing the gcd of natural numbers is extracted from a classical 
existence proof. A difference is that they extract the proof automatically from a 
formal classical proof, whereas we manually rephrase an informal classical proof. 

2 Gröbner Bases for Fields 

In this section we assume $K$ to be an arbitrary field with a decidable equality, 
e.g. the rationals $\mathbb{Q}$. We will consider the polynomial ring $K[X_1, \ldots, X_m]$, i.e. the 
set of polynomials in $m$ variables with coefficients in $K$. A good suggestion on 
how to define polynomial rings in type theory can be found in [Jac95], where 
a formalisation in Nuprl [Con86] is described. We abbreviate monic monomials 
$X_1^{k_1} \cdots X_m^{k_m}$ as $X^\alpha$, where $\alpha = (k_1, \ldots, k_m)$. We say that $X^\alpha$ divides $X^\beta$ 
if $\alpha \leq^m \beta$, where $(k_1, \ldots, k_m) \leq^m (l_1, \ldots, l_m) = k_1 \leq l_1 \,\&\, \cdots \,\&\, k_m \leq l_m$. 

When the $m$ is clear from the context, we will omit it and write $\leq$. Note 
that $X^\alpha \cdot X^\beta = X^{\alpha+\beta}$, where $\alpha + \beta = (k_1 + l_1, \ldots, k_m + l_m)$, and $X^\alpha / X^\beta = X^{\alpha-\beta}$, 
where $\alpha - \beta = (k_1 - l_1, \ldots, k_m - l_m)$ if $X^\beta$ divides $X^\alpha$. We define the least 
common multiple of $\alpha$ and $\beta$, $lcm(\alpha, \beta)$, as $(\max(k_1, l_1), \ldots, \max(k_m, l_m))$, if 
$\alpha = (k_1, \ldots, k_m)$ and $\beta = (l_1, \ldots, l_m)$. 
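
As an illustration, these operations on exponent vectors can be written directly on tuples. The function names below are choices of this sketch, not notation from the text.

```python
def divides(alpha, beta):
    """X^alpha divides X^beta iff alpha <= beta componentwise."""
    return all(k <= l for k, l in zip(alpha, beta))

def mul(alpha, beta):
    """Exponent vector of X^alpha * X^beta."""
    return tuple(k + l for k, l in zip(alpha, beta))

def quotient(alpha, beta):
    """Exponent vector of X^alpha / X^beta; only defined when X^beta divides X^alpha."""
    if not divides(beta, alpha):
        raise ValueError("X^beta does not divide X^alpha")
    return tuple(k - l for k, l in zip(alpha, beta))

def lcm(alpha, beta):
    """Exponent vector of the least common multiple of X^alpha and X^beta."""
    return tuple(max(k, l) for k, l in zip(alpha, beta))
```

For instance, with two variables, `lcm((2, 0), (1, 3))` is `(2, 3)`, i.e. the least common multiple of $X^2$ and $XY^3$ is $X^2Y^3$.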

We assume a total compatible well-founded order $\succ$ on the monomials, that 
is an order where $\alpha_1 \succ \alpha_2$ implies $\alpha_1 + \beta \succ \alpha_2 + \beta$. One possible order is: 
first order in terms of total degree, then monomials with equal total degree are 
ordered lexicographically. If $f$ is a polynomial, $c_1 \neq 0$ and $c_1 X^{\alpha_1}$ is the highest 
monomial in $f$ w.r.t. this order, we define $\mathrm{hd}\, f = c_1 X^{\alpha_1}$. We define the multi-degree 
of such $f$, $mdeg(f)$, to be $\alpha_1$. 

Next, we define a reduction algorithm in the ring $K[X_1, \ldots, X_m]$: 

Definition 1. A reduction of $f$ after reducing by a set of non-zero polynomials 
$G = \{g_1, \ldots, g_n\}$, $RED(f; G)$, is defined by $\succ$-recursion on $mdeg(f)$: 

$$RED(0; G) = 0,$$ 

$$RED(f; G) = \begin{cases} RED\left(f - \frac{\mathrm{hd}\, f}{\mathrm{hd}\, g_i}\, g_i;\ G\right), & \text{if } \exists g_i \in G.\ mdeg(g_i) \leq mdeg(f), \\ f, & \text{otherwise.} \end{cases}$$ 

To make RED into a deterministic algorithm, one must decide on a strategy to 
choose the gi in the above clause; for example to try gi first, then g2, and so on. 
One problem is that the choice of this strategy might affect the result; consider 
e.g. / = XY^ -\- X^ gi = XY — 1 and g2 = — Y. if we always try to 

reduce by $g_1$ first, $RED(f; g_1, g_2) = X + Y + 1$, whereas if we always try $g_2$ first, 
$RED(f; g_1, g_2) = 2X + 1$. Another problem is that $RED$ does not give a decision 
procedure for membership in ideals; e.g. $RED(X; Y^2 + X, Y) = X \neq 0$ for any 
strategy but $X \in Idl(Y^2 + X, Y)$, where $Idl(a_1, \ldots, a_n)$ is the ideal generated 
by $a_1, \ldots, a_n$. We say that a finite set is a Gröbner basis, if we can use $RED$ as 
a decision procedure to decide the ideal it generates: 

Definition 2. $G$ is a Gröbner basis (for the ideal it generates), if 
$RED(f; G) = 0$ whenever $f \in Idl(G)$. 
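
The head reduction of Definition 1 can be sketched as follows. The representation (a polynomial as a dictionary from exponent tuples to coefficients), the graded lexicographic order with $X$ above $Y$, and the "first applicable $g_i$ in list order" strategy are all assumptions of this sketch; the polynomials `f`, `g1`, `g2` are illustrative inputs chosen here so that the two strategies disagree.

```python
from fractions import Fraction

def order_key(alpha):
    """Graded lexicographic order: total degree first, then lexicographic."""
    return (sum(alpha), alpha)

def head(f):
    """Multi-degree and coefficient of the highest monomial of a non-zero f."""
    alpha = max(f, key=order_key)
    return alpha, f[alpha]

def sub_multiple(f, c, gamma, g):
    """Return f - c * X^gamma * g, dropping zero coefficients."""
    h = dict(f)
    for beta, d in g.items():
        e = tuple(x + y for x, y in zip(gamma, beta))
        h[e] = h.get(e, 0) - c * d
        if h[e] == 0:
            del h[e]
    return h

def red(f, G):
    """RED(f; G): cancel the head of f while some head of G divides it."""
    while f:
        alpha, c = head(f)
        for g in G:                    # strategy: try the g_i in list order
            beta, d = head(g)
            if all(b <= a for a, b in zip(alpha, beta)):
                gamma = tuple(a - b for a, b in zip(alpha, beta))
                f = sub_multiple(f, Fraction(c) / d, gamma, g)
                break
        else:
            return f                   # no head of G divides the head of f
    return f

# Exponent tuples are (degree in X, degree in Y).
f = {(1, 2): 1, (1, 0): 1, (0, 0): 1}      # X*Y^2 + X + 1
g1 = {(1, 1): 1, (0, 0): -1}               # X*Y - 1
g2 = {(0, 2): 1, (0, 0): -1}               # Y^2 - 1
r12 = red(f, [g1, g2])                     # X + Y + 1
r21 = red(f, [g2, g1])                     # 2X + 1
stuck = red({(1, 0): 1}, [{(0, 2): 1, (1, 0): 1}, {(0, 1): 1}])  # RED(X; Y^2 + X, Y)
```

The last call reproduces the membership counter-example from the text: $X$ is returned unchanged although $X = (Y^2 + X) - Y \cdot Y$ lies in the ideal.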

One can prove that $RED(f; G)$ is unique and independent of strategy when $G$ 
is a Gröbner basis, see e.g. [BW93]. 

2.1 Construction of Gröbner Bases 



Rather than first giving an algorithm which constructs Gröbner bases for ideals 
and then proving it correct, we will give a direct and constructive proof that for any 
finitely generated ideal, there exists a corresponding Gröbner basis. Since this 
proof is constructive, it will in particular contain an algorithm for constructing 
Gröbner bases. 

To motivate the development, consider the example set $\{Y^2 + X, Y\}$; this 
is not a Gröbner basis as explained above, since $X \in Idl(Y^2 + X, Y)$, but 
$RED(X; Y^2 + X, Y) \neq 0$. The first step is to find a systematic method to generate 
all such possible counter-examples. 



Definition 3. Given two polynomials $f$ and $g$, their S-polynomial, $spol(f, g)$, 
is defined as 

$$spol(f, g) = \frac{X^\alpha}{\mathrm{hd}\, f}\, f - \frac{X^\alpha}{\mathrm{hd}\, g}\, g$$ 

where $\alpha = lcm(mdeg(f), mdeg(g))$. 



S-polynomials give a practical characterisation of Gröbner bases: 

Theorem 4. $G = \{g_1, \ldots, g_t\}$ is a Gröbner basis if $RED(spol(g_i, g_j); G) = 0$ 
for all $i < j \leq t$. 



Proof. In this case we prove that the set of elements $f$ such that $RED(f; G) = 0$ 
is closed under addition. Since this set is clearly closed under multiplication by monomials, 
this will imply that we have $RED(f; G) = 0$ for all $f$ in $Idl(G)$, as desired. 

We prove that if $RED(f; G) = RED(g; G) = 0$ then $RED(f + g; G) = 0$ 
by induction on $mdeg(f)$ and $mdeg(g)$. The only case which is not direct is if 
we have $mdeg(f) = mdeg(g)$, $RED(f; G) = RED(f - m_i g_i; G)$, $RED(g; G) = 
RED(g - m_j g_j; G)$, where $m_i = r_i X^{\alpha_i}$, $m_j = r_j X^{\alpha_j}$ are suitable monomials, and 
$mdeg(f - m_i g_i) \prec mdeg(f)$, $mdeg(g - m_j g_j) \prec mdeg(g)$. We can write 

$$g_i = s_i X^{\beta_i} + h_i, \qquad g_j = s_j X^{\beta_j} + h_j$$ 

with $mdeg(h_i) \prec mdeg(g_i)$ and $mdeg(h_j) \prec mdeg(g_j)$. We have then 

$$f + g - \left(r_i + \frac{r_j s_j}{s_i}\right) X^{\alpha_i} g_i = m\, spol(g_i, g_j) + f - m_i g_i + g - m_j g_j$$ 

for a suitable monomial $m$. By induction hypothesis the right hand side reduces 
to 0, hence so does $f + g$. 



A naive approach to compute a Gröbner basis for a set is to add counter-examples 
(S-polynomials) to it until it satisfies the condition in Theorem 4. 
Quite surprisingly, this process will always terminate. The proof of this relies on 
a non-trivial combinatorial result known as Dickson’s lemma: for any sequence 
of monic monomials $X^{\alpha_1}, X^{\alpha_2}, \ldots$, there will eventually be $i < j$ such that $X^{\alpha_i}$ 
divides $X^{\alpha_j}$. 

2.2 A Constructive Proof of Dickson’s Lemma 

In this subsection, we present a constructive proof of Dickson’s lemma. The proof 
is a translation of the classical proof in Appendix A; the main non-constructive 
part, a minimal bad sequence argument, has been replaced by the open induction 
principle [Rao88,Coq92]. Dickson’s lemma says that for any infinite sequence 
of $n$-tuples of natural numbers $\sigma_1, \sigma_2, \ldots$ there exist $i < j$ such that $\sigma_i \leq^n \sigma_j$, 
where $(a_1, \ldots, a_n) \leq^n (b_1, \ldots, b_n)$ if $\forall 0 < i \leq n.\ a_i \leq b_i$. Classically, a relation 
is called well^1 if it satisfies the condition above. 
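
On a finite prefix of such a sequence, the pair promised by the lemma can be found by plain search. The helper below is a hypothetical illustration of the statement only; it contains none of the content of the proof, which concerns why the search must eventually succeed on every infinite sequence.

```python
def find_good_pair(seq):
    """Return the first (i, j), i < j, with seq[i] <= seq[j] componentwise, or None."""
    for j in range(len(seq)):
        for i in range(j):
            if all(a <= b for a, b in zip(seq[i], seq[j])):
                return i, j
    return None

bad_prefix = [(3, 0), (2, 1), (1, 2), (0, 3)]      # no such pair yet
good_pair = find_good_pair(bad_prefix + [(2, 2)])  # (2, 1) <= (2, 2)
```

The prefix `bad_prefix` is an antichain for the componentwise order, so the search returns `None` on it; appending `(2, 2)` makes the sequence good.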

Remark 5. The proofs in this section only require $<$ to be a decidable relation 
which is well-founded on its underlying set $A$, with $a \leq b$ defined as $\neg(b < a)$. 
However, we will only instantiate the theorems for $<$ being the less-than relation 
on $\mathbb{N}$. 

We want to express in type theory, extended with inductive definitions, 
what it means for a relation $R$ over a set $B$ to be well. To this end, we define 
$Good_R(b_0 \cdots b_m)$ to be $\exists i < j \leq m.\ b_i R b_j$. We use an inductive definition 
of bar [ML68] to express that for any infinite sequence $b_0 b_1 \cdots$, $Good_R(\sigma)$ will 
eventually hold for an initial segment $\sigma$ of the sequence. 

Definition 6. Given a set $B$ and a predicate $P$ over the lists of $B$, we define 
inductively when the predicate $P$ bars $\sigma$, written $P \mid \sigma$: 

$$\frac{P(\sigma)}{P \mid \sigma} \qquad\qquad \frac{\forall b.\ P \mid \sigma.b}{P \mid \sigma}$$ 

This is a generalised inductive definition [Acz77], which comes with a transfinite 
induction principle: 

$$\frac{\forall \rho.\ P(\rho) \to T(\rho) \qquad \forall \rho.\ (\forall b.\ P \mid \rho.b) \to (\forall b.\ T(\rho.b)) \to T(\rho)}{P \mid \sigma \to T(\sigma)}$$ 

Here $\sigma.b$ is the list $\sigma$ extended with the element $b$. Intuitively, $P \mid \sigma$ means that $P$ 
will eventually hold for any extension of $\sigma$. Classically, assuming the axiom of 
dependent choices, $Good_R \mid []$ is provable iff $R$ satisfies the classical definition 
of well. This justifies defining $R$ to be well iff $Good_R \mid []$ is provable. 
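
For readers who think in terms of a proof assistant, the bar predicate of Definition 6 can be rendered as an inductive family. The following Lean 4 sketch is an illustration only and is not the computer formalisation mentioned in the abstract; the names `Bar`, `base` and `step` are assumptions, and `σ ++ [b]` plays the role of $\sigma.b$.

```lean
/-- `Bar P σ` : the predicate `P` bars the list `σ`. -/
inductive Bar {B : Type} (P : List B → Prop) : List B → Prop
  | base {σ : List B} : P σ → Bar P σ
  | step {σ : List B} : (∀ b : B, Bar P (σ ++ [b])) → Bar P σ
```

The recursor generated for such a definition has the same two premises as the transfinite induction principle stated above: one for lists on which `P` already holds, and one passing from all one-element extensions of a list to the list itself.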

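Lemma 7. If $\forall \sigma, \rho, \gamma.\ P(\sigma\rho) \to P(\sigma\gamma\rho)$, then $\forall \sigma, \rho, \gamma.\ P \mid \sigma\rho \to P \mid \sigma\gamma\rho$. 

Proof. Immediate by induction on the proof of $P \mid \sigma\rho$. 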

Given two relations $R$ and $S$ over sets $A$ and $B$ respectively, the product 
relation, $R \times S$, over $A \times B$ is defined as $(a, b)\,(R \times S)\,(a', b') = a R a' \,\&\, b S b'$. 
Following [Coq92], we define a predicate $M(\sigma)$, expressing that an initial sequence 
$\sigma$ of pairs in $A \times B$ is minimal w.r.t. $<$, by recursion on $\sigma$: 

$$M([]) = \top,$$ 

$$M(\sigma.(x, b)) = M(\sigma) \,\&\, \forall y.\ y < x \to \forall b.\ Good_{\leq \times R} \mid \sigma.(y, b).$$ 

The predicate $M$ will play the role of the minimal bad sequence in Appendix A.1; 
if $M(\sigma)$ holds, and $\rho$ has a lexicographically smaller sequence of first components, 
then $Good_{\leq \times R} \mid \rho$ should hold. 

^1 In previous work, the relation was required to be a quasi-order (well-quasi order). 

Raoult’s open induction principle can be expressed using these definitions: 

Theorem 8 (Open Induction). For any finite sequence $\sigma$ of pairs in $A \times B$, 
if $M(\sigma)$ and $\forall a, b.\ M(\sigma.(a, b)) \to Good_{\leq \times R} \mid \sigma.(a, b)$, then $Good_{\leq \times R} \mid \sigma$. 

Proof. Assume $\sigma$ to be a finite sequence such that $M(\sigma)$ and $\forall a, b.\ M(\sigma.(a, b)) \to 
Good_{\leq \times R} \mid \sigma.(a, b)$. We prove $\forall x.\ \forall b.\ Good_{\leq \times R} \mid \sigma.(x, b)$ by induction on $x$: 
Assume $\forall y.\ y < x \to \forall b.\ Good_{\leq \times R} \mid \sigma.(y, b)$. From this we directly obtain 
$M(\sigma.(x, b))$, and by hypothesis, $Good_{\leq \times R} \mid \sigma.(x, b)$. 

This theorem is a simplification of that in [Coq92] but it uses only generalised 
inductive definitions iterated once [Acz77]. It will interpret the argument: if 
$Good_{\leq \times R} \mid \sigma$ holds under the assumption that $\sigma$ starts a minimal bad sequence, 
then $Good_{\leq \times R} \mid \sigma$ holds without this assumption as well. Therefore, the classical 
proof of Dickson’s lemma in Appendix A can be interpreted as: 

Lemma 9. If $Good_R \mid b_1 \cdots b_m$ holds, then 

$$\forall x_1, \ldots, x_m.\ M((x_1, b_1) \cdots (x_m, b_m)) \to Good_{\leq \times R} \mid (x_1, b_1) \cdots (x_m, b_m).$$ 

Proof. By induction on the proof of $Good_R \mid b_1 \cdots b_m$: 

$Good_R(b_1 \cdots b_m)$: Then there exist $i < j \leq m$ such that $b_i R b_j$. Now, by 
cases on the decidable $<$, we have either $\neg(x_i > x_j)$, that means $x_i \leq x_j$ so 
$(x_i, b_i)\,(\leq \times R)\,(x_j, b_j)$ holds, hence $Good_{\leq \times R}((x_1, b_1) \cdots (x_m, b_m))$. Otherwise, 
$x_i > x_j$, and since $M((x_1, b_1) \cdots (x_m, b_m))$ implies $M((x_1, b_1) \cdots (x_i, b_i))$, 
we get $Good_{\leq \times R} \mid (x_1, b_1) \cdots (x_{i-1}, b_{i-1})(x_j, b_j)$, and by Lemma 7, we are 
done. 

$\forall b.\ Good_R \mid (b_1 \cdots b_m b)$: Immediate by Theorem 8 and IH. 

Corollary 10 (Dickson’s lemma). For all $n \in \mathbb{N}$, $Good_{\leq^n} \mid []$. 

Proof. By induction on $n$: the case $n = 0$ is trivial. Otherwise, $n = m + 1$, and 
$Good_{\leq^m} \mid []$ holds. We instantiate the development above with $R$ being $\leq^m$, and 
by Lemma 9 we are done. 

Dickson’s lemma implies the existence of Gröbner bases for any finitely generated 
(f.g.) ideal in $K[X_1, \ldots, X_m]$. 

Theorem 11. Every f.g. ideal $F = Idl(f_1, \ldots, f_n)$ has a Gröbner basis $G$. 

Proof. We prove this using Dickson’s lemma. Let $m_i = mdeg(f_i)$. Define $Bad(\sigma)$ 
as $\neg Good(\sigma)$. We can assume that $Bad(m_1 \cdots m_n)$ holds; if $m_i \leq m_j$ for $i < j$, 
we repeatedly reduce $F$ by dividing $f_j$ by $f_i$. The result follows from Dickson’s 
lemma and the following lemma: 

$$Good_{\leq} \mid m_1 \cdots m_k \to \forall f_1, \ldots, f_k.\ (\forall i.\ m_i = mdeg(f_i)) \to Bad(m_1 \cdots m_k) \to \exists G.\ G \text{ is a Gröbner basis for } f_1, \ldots, f_k,$$ 

which is proved by induction on the proof of $Good_{\leq} \mid m_1 \cdots m_k$. 

$Good(m_1 \cdots m_k)$: This contradicts $Bad(m_1 \cdots m_k)$. 

$\forall m.\ Good_{\leq} \mid m_1 \cdots m_k m$: Assume $f_1, \ldots, f_k$ with $mdeg(f_i) = m_i$ for all $i$. Consider 
$RED(spol(f_i, f_j); f_1, \ldots, f_k)$ for $i \neq j$. If all are zero, $\{f_1, \ldots, f_k\}$ is 
already a Gröbner basis by Theorem 4. Otherwise, let $f_{k+1}$ be the first 
$RED(spol(f_i, f_j); f_1, \ldots, f_k) \neq 0$. Then $m_l \not\leq mdeg(f_{k+1})$ for all $l \leq k$, so if 
$Bad(m_1 \cdots m_k)$, then $Bad(m_1 \cdots m_k\, mdeg(f_{k+1}))$. Hence, by IH, we have a 
Gröbner basis for $F \cup \{f_{k+1}\}$, and since $f_{k+1}$ is in $Idl(F)$, this is a Gröbner 
basis for $F$. □ 

This is an integrated version of Buchberger’s algorithm: while F is not a Gröbner
basis, add normalised S-polynomials to F. Optimisations of the algorithm can
be made by changing the proof, e.g. the order in which to compute the
S-polynomials.
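
The loop "while F is not a Gröbner basis, add normalised S-polynomials to F" can be sketched in Python for the simplest possible setting, univariate polynomials over the rationals, where it degenerates into a Euclid-style gcd computation. This is only an illustration under stated assumptions, not the program extracted from the proof: polynomials are coefficient lists in increasing degree, and the names `trim`, `spol`, `red`, `first_nonzero_remainder` and `buchberger` are hypothetical.

```python
from fractions import Fraction


def trim(p):
    """Copy p and drop trailing zeros (coefficients are stored low to high)."""
    p = list(p)
    while p and p[-1] == 0:
        p.pop()
    return p


def spol(f, g):
    """S-polynomial of two non-zero polynomials: make both monic at the
    common top degree and subtract, so the leading terms cancel."""
    d = max(len(f), len(g))
    a = [Fraction(0)] * (d - len(f)) + [c / f[-1] for c in f]
    b = [Fraction(0)] * (d - len(g)) + [c / g[-1] for c in g]
    return trim([x - y for x, y in zip(a, b)])


def red(h, basis):
    """Normal form of h: cancel the leading term while some basis element
    has degree at most deg(h)."""
    h = trim(h)
    while h:
        f = next((f for f in basis if len(f) <= len(h)), None)
        if f is None:
            break
        shift, c = len(h) - len(f), h[-1] / f[-1]
        for i, coeff in enumerate(f):
            h[i + shift] -= c * coeff
        h = trim(h)
    return h


def first_nonzero_remainder(basis):
    """First non-zero reduced S-polynomial, or None if all reduce to zero."""
    for i in range(len(basis)):
        for j in range(i + 1, len(basis)):
            r = red(spol(basis[i], basis[j]), basis)
            if r:
                return r
    return None


def buchberger(polys):
    """Add reduced S-polynomials until every one of them reduces to zero."""
    basis = [trim([Fraction(c) for c in p]) for p in polys]
    basis = [p for p in basis if p]
    while True:
        r = first_nonzero_remainder(basis)
        if r is None:
            return basis
        basis.append(r)


# x^2 - 1 and x^2 - x generate the same ideal as x - 1.
G = buchberger([[-1, 0, 1], [0, -1, 1]])
print(min(G, key=len))
```

Termination here mirrors the argument in the proof: each added remainder has strictly smaller degree than every element already in the basis, so the sequence of leading degrees stays bad and must be finite.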



3 Hilbert’s Basis Theorem 

In this section we prove constructively Hilbert’s Basis Theorem (HBT). This
is used to prove termination of generalisations of Buchberger’s algorithm for
computing Gröbner bases for polynomials over principal ideal domains [BW93]
and other algebraic structures [JL91].

In these more general cases, there are similar notions of reduction, and ways
of generating counter-examples and of deciding whether the set is a Gröbner basis.
As in the previous section, we need to prove that the process of adding
counter-examples to the ideal ends. This follows from HBT, which concerns Noetherian
rings. There are several classically equivalent definitions of Noetherian:

Definition 12. A ring R is Noetherian if any of the following classically
equivalent conditions holds:

1. every ideal in R is finitely generated, 

2. there exists no infinite strictly increasing sequence of ideals, 

3. for every infinite sequence $a_0, a_1, \ldots$ of elements in R, there exists an m such
that $a_m \in Idl(a_0, \ldots, a_{m-1})$.

HBT says that if R is a Noetherian ring, so is $R[X_1, \ldots, X_m]$.
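
Condition 3 can be made concrete for the ring of integers, where membership in a finitely generated ideal is a divisibility test against the gcd of the generators. The Python sketch below is an illustrative assumption (the name `first_redundant_index` is hypothetical, and the choice of the integers is ours): it returns the first m such that $a_m \in Idl(a_0, \ldots, a_{m-1})$.

```python
from math import gcd


def first_redundant_index(seq):
    """First m with seq[m] in the ideal generated by seq[0..m-1] in Z.

    In Z, Idl(a_0, ..., a_{m-1}) is generated by g = gcd(a_0, ..., a_{m-1});
    the empty ideal is {0}, encoded here as g == 0.
    """
    g = 0
    for m, a in enumerate(seq):
        in_ideal = (a == 0) if g == 0 else (a % g == 0)
        if in_ideal:
            return m
        g = gcd(g, a)
    return None  # the finite prefix has no such m yet


print(first_redundant_index([12, 18, 8, 10]))
```

Each element that is not yet in the ideal strictly shrinks the gcd, which is why no infinite sequence of integers can avoid such an m.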

3.1 A Constructive Proof of Hilbert’s Basis Theorem 

In this subsection, we present a constructive proof of HBT. Again, the proof is
a translation of the classical proof in Appendix A; as in the previous section,
the open induction principle [Rao88,Coq92] can be used to replace the minimal
bad sequence argument. Contrary to the constructive proofs of HBT previously
known to us [Ric74,MRR88,JL91], the proof extracted does not require
decidability of the equality in the ring or its ideals.

We want to express in type theory, extended with inductive definitions,
the notion of Noetherian rings. To this end, we define $Good_R(a_0 \cdots a_m)$ to
be $\exists k \le m.\ a_k \in Idl(a_0, \ldots, a_{k-1})$ for any ring R. The inductive definition
of bar gives a good constructive definition of Noetherian: classically, assuming
the axiom of dependent choices, $Good_R \mid []$ is provable iff R satisfies the second
condition in Definition 12. This justifies defining a ring R to be Noetherian
iff $Good_R \mid []$ is provable.

Lemma 13. If $Good_R \mid a_0 \cdots a_k \sigma$, then $Good_R \mid a_0 \cdots a_{k-1} (a_k + \sum_{i=0}^{k-1} r_i a_i) \sigma$.

Proof. By induction on the proof of $Good_R \mid a_0 \cdots a_k \sigma$: If $Good_R(a_0 \cdots a_k \sigma)$,
then it is direct since any $\sum r_i a_i$ is in $Idl(a_0, \ldots, a_{k-1})$. Otherwise, we have
$\forall a.\ Good_R \mid a_0 \cdots a_k \sigma a$, and the result follows by IH.

To prove HBT, it is enough to prove that for any finitely generated R-module
if R is Noetherian, then R[X] is Noetherian. In the rest of this section,
we assume R[X] to be a finitely generated R-module. To interpret the minimal
bad sequence argument in Appendix A.2, we define a predicate $M(\sigma)$, expressing
that the finite sequence $\sigma$ of lists of R is minimal, and prove a corresponding
open induction principle for it.

$M([]) = \top$

$M(\sigma.f) = M(\sigma) \;\&\; \forall g.\ |g| < |f| \Rightarrow P(\sigma.g)$

where $|f|$ is the length of the list f, and $P(f_1 \cdots f_k) = Good_{R[X]} \mid \varphi(f_1) \cdots \varphi(f_k)$
where $\varphi(a_0 \cdots a_k) = a_0 + a_1 X + \cdots + a_k X^k$. If we write $\delta f = 0.f$, where $0.f$
is the list f with 0 put in front, then $\varphi(\delta f) = X \varphi(f)$. We use lists here rather
than elements of R[X] to avoid a decidable equality in R in the proof of HBT.
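
The list representation can be illustrated by a small Python sketch over the integers. The names `phi` and `delta` follow the symbols above, but interpreting $\varphi(f)$ by evaluation at a point x is our own simplification for the sake of a runnable check: the sketch confirms $\varphi(\delta f) = X \varphi(f)$ pointwise and shows that the length of a list, unlike the degree of a polynomial, is known without testing coefficients for zero.

```python
def phi(coeffs, x):
    """Value at x of a_0 + a_1 X + ... + a_k X^k for the list [a_0, ..., a_k]."""
    return sum(a * x**i for i, a in enumerate(coeffs))


def delta(coeffs):
    """The list with 0 put in front, i.e. multiplication by X."""
    return [0] + list(coeffs)


f = [1, 2, 3]                     # 1 + 2X + 3X^2
print(phi(delta(f), 5), 5 * phi(f, 5))

# [1, 2, 0] and [1, 2] denote the same polynomial but have different lengths,
# so |f| is available without deciding whether a coefficient equals 0.
print(len([1, 2, 0]), len([1, 2]))
```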

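Theorem 14 (Open Induction). For any sequence $\sigma$ of lists of R, if $M(\sigma)$
and $\forall f.\ M(\sigma.f) \Rightarrow P(\sigma.f)$, then $P(\sigma)$.

Proof. Assume $\sigma$ such that $M(\sigma)$ and $\forall f.\ M(\sigma.f) \Rightarrow P(\sigma.f)$. We prove
$\forall f.\ P(\sigma.f)$ by induction on the degree of f: Assume $\forall g.\ |g| < |f| \Rightarrow P(\sigma.g)$.
This directly implies $M(\sigma.f)$, and by hypothesis, $P(\sigma.f)$.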

The classical proof of Hilbert’s Basis Theorem in Appendix A.2 becomes:

Lemma 15. Given $a_0, \ldots, a_m$ of R such that $Good_R \mid (a_0 \cdots a_m)$ holds, and a
sequence $f_0, \ldots, f_m$ of lists of R,
$M((f_0.a_0) \cdots (f_m.a_m)) \Rightarrow P((f_0.a_0) \cdots (f_m.a_m))$.

Proof. By induction on the proof of $Good_R \mid a_0 \cdots a_m$:

$Good_R(a_0 \cdots a_m)$: Then there exists a $k \le m$ such that $a_k \in Idl(a_0, \ldots, a_{k-1})$,
hence $a_k = r_0 a_0 + \cdots + r_{k-1} a_{k-1}$ for $r_0, \ldots, r_{k-1} \in R$. Let $g_i = f_i.a_i$.
Either there exists $0 \le i < k$ such that $|g_i| > |g_k|$. In that case, since we
have $M(g_0 \cdots g_i)$, we obtain $P(g_0 \cdots g_{i-1} g_k)$ and by Lemma 7 repeatedly,
$P(g_0 \cdots g_{i-1} g_i \cdots g_k \cdots g_m)$. In the other case, $|g_i| \le |g_k|$ for all $i < k$, and
we construct

$g^* = f_k - \sum_{i=0}^{k-1} r_i\, \delta^{|g_k| - |g_i|} f_i$

where summation of lists of equal length is just pointwise addition, and
scalar multiplication is taken pointwise. Since $|g^*| < |g_k|$ and $M(g_0 \cdots g_k)$, we
have $P(g_0 \cdots g_{k-1} g^*)$. Note that $\varphi(g^*) = \varphi(g_k) - \sum_{i=0}^{k-1} r_i X^{|g_k| - |g_i|} \varphi(g_i)$,
so by Lemma 13, $P(g_0 \cdots g_{k-1} g_k)$, and by Lemma 7, we are done.

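$\forall a.\ Good_R \mid a_0 \cdots a_m a$: By IH, we get for any $a \in R$ and sequence $f_0, \ldots, f_m, f$
of lists of R, $M((f_0.a_0) \cdots (f_m.a_m)(f.a)) \Rightarrow P((f_0.a_0) \cdots (f_m.a_m)(f.a))$.

This also holds if $f.a$ is replaced by the empty list, so by Theorem 14,
$M((f_0.a_0) \cdots (f_m.a_m)) \Rightarrow P((f_0.a_0) \cdots (f_m.a_m))$.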



Corollary 16 (Hilbert’s Basis Theorem). $Good_R \mid [] \Rightarrow Good_{R[X_1, \ldots, X_m]} \mid []$.

Proof. By induction on m and Lemma 15, since R[X] is a ring if R is.

4 Conclusions 

This work shows how a non-trivial part of classical mathematics can be
translated into constructive type theory by using the open induction principle. The
constructive proofs share the elegance and brevity of the original proofs, and have
a direct formalisation in type theory. This is shown in Appendix B below, where
a computer formalisation of Dickson’s lemma and an abstract existence proof of
Gröbner bases are presented.

This can be seen as a general and integrated development of Buchberger’s
algorithm in a functional language. To be able to execute this program, one needs
to formalise a polynomial ring over a field, the reduction function, and
Theorem 4. For the remaining formalisation, one should be able to use the already
existing work of Jackson [Jac95] and Théry [The98].

A Classical Proofs 

A.1 A Classical Proof of Dickson’s Lemma

Dickson’s lemma has a short classical proof which uses a minimal bad sequence
argument [NW63]:

Proposition 17 (Dickson’s lemma). For all $n \in \mathbb{N}$, $\le^n$ is well.

Proof. By induction on n: If n = 0, it is trivial, so assume n = m + 1 and $\le^m$
is well. We prove that if $<$ is well-founded and R is well, then $\le \times R$ is also
well, where $a \le b$ is defined as $\neg(b < a)$. The proof is by contradiction: if
$\le \times R$ is not well, then there exists an infinite sequence $u_1, u_2, \ldots$ which is bad,
i.e. for no $i < j$, $u_i\,(\le \times R)\,u_j$. Using the axiom of dependent choices, we can
construct a minimal bad sequence $v_1, v_2, \ldots$ in the following way: choose
$v_1 = (x_1, w_1)$ with $x_1$ minimal among the first components of those pairs which start
a bad sequence. When $v_1, \ldots, v_k$ have been chosen, choose $v_{k+1} = (x_{k+1}, w_{k+1})$
with $x_{k+1}$ minimal among the first components of those tuples continuing a bad
sequence from $v_1, \ldots, v_k$. The existence of such a minimal bad sequence is the
main non-constructive part of the proof.

Since R is well by assumption, there exists $i < j$ such that $w_i R w_j$. But by
construction, we must have $x_i \le x_j$, otherwise $(x_j, w_j)$ would continue a smaller
bad sequence. Hence $(x_i, w_i)\,(\le \times R)\,(x_j, w_j)$, which contradicts that $v_1, v_2, \ldots$
is a bad sequence.



A.2 A Classical Proof of Hilbert’s Basis Theorem

We give a short classical proof of HBT taken from [BW93], which uses two of
the equivalent conditions in Definition 12:

Proposition 18. If R is a Noetherian ring, then so is $R[X_1, \ldots, X_n]$.

Proof. Since R[X] is a ring if R is, it is enough to consider the case n = 1, the
other cases follow by induction. Assume for a contradiction that I is an ideal of
R[X] and I is not finitely generated. Then I is not the zero ideal. We construct
an infinite sequence $f_0, f_1, \ldots$ of polynomials in I using the axiom of dependent
choices:

1. $f_0$ is a non-zero polynomial in I of minimal degree.

2. $f_{i+1}$ is a polynomial in $I \setminus Idl(f_0, \ldots, f_i)$ of minimal degree.

It is clear that $\deg f_j \le \deg f_k$ if $j < k$. We denote the (non-zero) head
coefficient of $f_i$ by $a_i$. Since R is Noetherian, there exists an m such that
$a_m \in Idl(a_0, \ldots, a_{m-1})$, i.e. $a_m = r_0 a_0 + \cdots + r_{m-1} a_{m-1}$ where $r_0, \ldots, r_{m-1} \in R$. But
then the polynomial

$f^* = f_m - \sum_{i=0}^{m-1} r_i X^{\deg f_m - \deg f_i} f_i$

must be in $I \setminus Idl(f_0, \ldots, f_{m-1})$ since otherwise $f_m \in Idl(f_0, \ldots, f_{m-1})$. But
$\deg f^* < \deg f_m$, contradicting the minimality of $f_m$.



B Formal Proofs in Agda 

Here we present some formal proofs in Agda [Coq98], a type-checker in the
ALF-family for a variant of Martin-Löf’s type theory. This type theory is very
similar to Cayenne [Aug98], a functional programming language with dependent
types.

B.1 Dickson’s Lemma in Agda



-- Theory of Dickson's lemma, takes a wellfounded relation gt and a well relation R
-- and proves leq x R (product of the two relations) to be a well relation.
package thDickson (A,B::Set)(gt::Rel A)(R::Rel B)
                  (wfgt::WF A gt)(dgt::decRel A gt)(gR::WR B R) =
  let
    leq (a,b::A) :: Set = Not (gt a b)
    leqxR (x,y::A*B) :: Set = (leq x.fst y.fst) & (R x.snd y.snd)
    GBarlR :: Pred (List (A*B)) = GRBar (A*B) leqxR
  in
  open OpenIndGoodRel (A*B) (fstR A B gt) leqxR (WFlem1 A gt wfgt (A*B) (fstR A B gt)
                      (Fst A B) (\(x,y::A*B) -> \(h::gt x.fst y.fst) -> h))
  use Min, open_ind
  in
  struct
    (:) (x::A*B)(l::List (A*B)) :: List (A*B) = @Cons x l
    sndL :: Fun (List (A*B)) (List B) = let {f (a::A*B) :: B = a.snd} in map (A*B) B f
    lem0 (l::List (A*B))(a::A*B)(h1::ExistsL B (\(x::B) -> R x a.snd) (sndL l))(h::Min l)
      :: GBarlR (a:l) =
      let lem (vs::List (A*B))(h::ExistsL B (\(x1::B) -> R x1 a.snd) (sndL vs))(h1::Min vs)
            :: GBarlR (a:vs) =
            case vs of
              (Nil) -> case h of { }
              (Cons a1 as) -> case h of
                (Inl x) -> case dgt a1.fst a.fst of
                  (No no) -> @Base (@Inl (@Inl
                               (struct{fst :: leq a1.fst a.fst = no;
                                       snd :: R a1.snd a.snd = x })))
                  (Yes a') -> GRBarmon (A*B) leqxR (a:@Nil) as
                                (h1.fst a a') (a1:@Nil)
                (Inr y) -> GRBarmon (A*B) leqxR (a:@Nil) as
                             (lem as y h1.snd) (a1:@Nil)
      in lem l h1 h
    lem1 (us::List (A*B))(h1::GoodR B R (sndL us))(h::Min us) :: GBarlR us =
      case us of
        (Nil) -> case h1 of { }
        (Cons a as) -> case h1 of
          (Inl x) -> case as of
            (Nil) -> case x of { }
            (Cons a' as') -> lem0 (a':as') a x h.snd
          (Inr y) -> GRBarmon (A*B) leqxR @Nil as
                       (lem1 (append (A*B) @Nil as) y h.snd) (a:@Nil)
    keylem (us::List (A*B))(h :: GRBar B R (sndL us)) :: (h1::Min us) -> GBarlR us =
      case h of
        (Base h1) -> lem1 us h1
        (Ind f) -> \(h1::Min us) -> open_ind us h1 (\(u::A*B) -> keylem (u:us) (f u.snd))
    keylem_cor :: WR (A*B) leqxR = keylem @Nil gR @tt

-- This is the general version of Dickson's lemma.
Dickson (A::Set)(gt::Rel A)(wfgt::WF A gt)(dgt::decRel A gt)(n::Nat)
  :: WR (Vec A n) (VecRel A (NotR A gt) n) =
  case n of
    (Zero) -> @Ind (\(a::Vec A @Zero) ->
              @Ind (\(a1::Vec A @Zero) -> @Base (@Inl (@Inl @tt))))
    (Succ n1) -> let package thD = thDickson A (Vec A n1) gt (VecRel A (NotR A gt) n1)
                                     wfgt dgt (Dickson A gt wfgt dgt n1)
                 in thD.keylem_cor

B.2 Abstract Existence Proof of Gröbner Bases

Below is a formal proof of the existence of Gröbner bases for ideals in an arbitrary
polynomial ring (Poly). We do not need to assume any properties of a polynomial
ring at this level. The development is general and captures the reasoning in
Section 2; for instance the assumption bars::GBar @Nil can be instantiated by
the formal proof of Dickson’s lemma above. Some parts of the proof terms were
omitted to improve the presentation; these are denoted by ‘...’.



package GB0 (Poly::Set)
            (Z::Poly)
            (plusP, timesP :: BinOp Poly)
            ((==) :: Rel Poly)
            (P::Poly->Pred (List Poly)) =
  let LP :: Set = List Poly
      ForallLP :: Pred Poly -> Pred LP = ForallL Poly
      ExistsLP :: Pred Poly -> Pred LP = ExistsL Poly
      Good :: Pred (List Poly) = GoodP Poly P
      GBar :: Pred LP = Bar Poly Good
      Bad :: Pred LP = NotP LP Good
      (:) (f::Poly)(fs::LP) :: LP = @Cons f fs
  in
  open pkideal Poly (==) plusP timesP Z use Ideal, eql, eqvl, congl, ilem1, ilem2
  in
  struct
    package GB1 (spols::LP -> LP)
                (GB::Pred LP)
                (RED::Poly -> LP -> Poly)
                (deceq::decRel Poly (==))
                (ispol::(fs::LP)->ForallLP (\(g::Poly)-> Ideal fs g) (spols fs))
                (iRED::(f::Poly)->(fs::LP)->Ideal (f:fs) (RED f fs))
                (iREDP::(g::Poly)->(gs::LP)->P (RED g gs) gs -> RED g gs == Z)
                (Pprop::(f::Poly)->(fs::LP)->Exists Poly (\(g::Poly) ->
                                     eql (f:fs) (g:fs) & Not (P g fs)))
                (gbchar::(gs::LP)->ForallLP (\(g'::Poly) -> RED g' gs == Z)
                                     (spols gs) -> GB gs)
                (bars::GBar @Nil)
      = open eqvl use ire, isy, itr
        in
        struct
          lem1 (fs::LP) :: Or (ExistsLP (\(h::Poly) -> Not (RED h fs == Z)) (spols fs))
                              (ForallLP (\(h::Poly) -> RED h fs == Z) (spols fs))
            = existsLlem1 Poly (\(h::Poly) -> RED h fs == Z)
                          (\(x::Poly) -> deceq (RED x fs) Z) (spols fs)

          badlem (gs::LP)(g::Poly)(h1::Not (RED g gs == Z))(b::Bad gs) :: Bad ((RED g gs):gs) =
            \(x::Good ((RED g gs):gs)) -> case x of
              (Inl x') -> h1 (iREDP g gs x')
              (Inr y) -> b y

          remEql (fs::LP) :: ForallLP (\(g'::Poly) -> eql ((RED g' fs):fs) fs) (spols fs) =
            let l1 (sfs::LP)
                   (f1::ForallLP (\(g::Poly)-> Ideal fs g) sfs)
                  :: ForallLP (\(g'::Poly) -> eql ((RED g' fs):fs) fs) sfs =
                  case sfs of
                    (Nil) -> @tt
                    (Cons a as) -> ...
            in l1 (spols fs) (ispol fs)

          GBof (F,G::LP) :: Set = eql G F & GB G

          thm (fs::LP)(gb::GBar fs)(b::Bad fs) :: Exists LP (GBof fs) =
            case gb of
              (Base h) -> elimN0 (Exists LP (GBof fs)) (b h)
              (Ind f) ->
                case lem1 fs of
                  (Inl x) -> let l1 (sgs::LP)
                                    (f2::ExistsLP (\(g'::Poly) -> Not (RED g' fs == Z)) sgs)
                                    (f3::ForallLP (\(g'::Poly) -> eql ((RED g' fs):fs) fs) sgs)
                                   :: Exists LP (GBof fs) =
                                   case sgs of
                                     (Nil) -> case f2 of { }
                                     (Cons a as) -> ...
                             in l1 (spols fs) x (remEql fs)
                  (Inr y) -> struct fst :: LP = fs
                                    snd :: GBof fs fs = struct fst :: eql fs fs = ire fs
                                                               snd :: GB fs = gbchar fs y

          badprop (f::Poly)(fs::LP)(b::Bad fs)(np::Not (P f fs)) :: Bad (f:fs) =
            \(x::Good (f:fs)) -> case x of
              (Inl x') -> np x'
              (Inr y) -> b y

          exbad (fs::LP) :: Exists LP (\(gs::LP) -> Bad gs & eql fs gs)
            = case fs of
                (Nil) -> struct fst :: LP = @Nil
                                snd :: Bad fst & (eql @Nil fst) = ...
                (Cons a as) -> ...

          cor (fs::LP) :: Exists LP (GBof fs) = let gs :: LP = (exbad fs).fst
                                                    bad :: Bad gs = (exbad fs).snd.fst
                                                    eq :: eql fs gs = (exbad fs).snd.snd
                                                    th :: Exists LP (GBof gs) = ...

                                                in ...

Abstract. An extension of the simply-typed λ-calculus, allowing iteration
and case reasoning over terms of functional types that arise when
using higher order abstract syntax, has recently been introduced by F.
Pfenning, C. Schürmann and the first author. This thorny mixing is
achieved thanks to the help of the operator ‘□’ of modal logic S4. Here
we give a new presentation of their system, with reduction rules, instead
of evaluation judgments, that compute the canonical forms of terms.
Our presentation is based on a modal λ-calculus that is better from the
user’s point of view. Moreover we do not impose a particular strategy
of reduction during the computation. Our system enjoys the decidability
of typability, soundness of typed reduction with respect to typing
rules, the Church-Rosser and strong normalization properties and it is a
conservative extension over the simply-typed λ-calculus.



1 Introduction 

Higher order abstract syntax ([PE88]) is a representation technique which proves
useful when modelling, in a logical framework, a language which involves
bindings of variables. Thanks to this technique, the formalization of an
(object-level) language does not need definitions for free or bound variables in a term. Nor
does it need definitions of notions of substitutions, which are implemented using
the meta-level application, i.e. the application available in the logical framework.
Hypothetical judgments are also directly supported by the framework.

On the other hand, inductive definitions are frequent in mathematics and
semantics of programming languages, and induction is an essential tool when
developing proofs. Unfortunately it is well-known that a type defined by means
of higher order abstract syntax cannot be defined as an inductive type in usual
inductive type theories (like CCI [Wer94,PM92], or Martin-Löf’s Logical
Framework [NPS90] for instance).

In a first step towards the resolution of this dilemma, Frank Pfenning, Carsten
Schürmann and the first author have presented ([DPS97]) an extension of the
simply-typed λ-calculus with recursive constructs (operators for iteration and
case reasoning), which enables the use of higher order abstract syntax in an
inductive type. To achieve that, they use the operator ‘□’ of modal logic IS4 to
distinguish the types $A \to B$ of the functional terms well-typed in the
simply-typed λ-calculus from the types $\Box A \to B$ of the functional terms possibly
containing recursive constructs.

In this paper, we present an alternative presentation of their system that we
claim to be better in several aspects. We use the same mechanism as theirs to mix
higher order abstract syntax and induction but our typing and reduction rules
are quite different. Indeed there are several presentations of modal λ-calculus IS4
([BdP96,PW95], [DP96]). We have chosen the variant by Frank Pfenning and
Hao-Chi Wong ([PW95]), which has context stacks instead of simple contexts.
This peculiarity creates some difficulties in the metatheoretical study but the
terms generated by the syntax are simpler than those of [DPS97] (no ‘let box’
construction), and so this system is more comfortable to use.

Moreover, instead of introducing an operational semantics which computes
the canonical form (η-long normal form) using a given strategy, our system
has reduction rules, which allow a certain nondeterminism in the mechanism
of reduction. We have been able to adapt classic proof techniques to show the
important metatheoretic results: decidability of typability, soundness of typed
reduction with respect to typing rules, Church-Rosser property (CR), Strong
Normalization property (SN) and conservativity of our system with respect to the
simply-typed λ-calculus. The main problems we encountered in the proofs are on one
hand due to the use of functional types in the types of the recursive
constructors, and on the other hand due to the use of η-expansion. To solve the problems
due to η-expansion, we benefit from previous works done for the simply-typed
λ-calculus ([JG95]) and for system F ([Gha96]).

In the second section of the paper, we introduce our version of the modal
inductive system, its syntax, its typing and reduction rules. Then in the third
section, we prove its essential properties (soundness of typing, CR, SN) from which
we deduce that it is a conservative extension of the simply-typed λ-calculus.
Finally, we discuss related works and outline future work. A full version of this
paper with complete technical developments is available in [Lel97].

2 The System 

In this section, we present the syntax, the typing rules and the semantics of our 
system. First, let us briefly recall our motivations. 



2.1 Higher-Order Abstract Syntax 

The mechanics of higher order abstract syntax (HOAS) has already been exposed
in many places, for example in [HHP93]. Let us introduce here a simple example
of representation using HOAS, that will be useful later when we illustrate the
mechanism of the reduction rules.

Suppose we want to represent the untyped λ-terms in the simply-typed
λ-calculus with no extra equations. We introduce the type L of untyped λ-terms
together with two constructors $lam : (L \to L) \to L$ and $app : L \to L \to L$.

It is well-known ([HHP93]) that the canonical forms (β-normal η-long) of
type L are in one-to-one correspondence with the closed untyped λ-terms and
that this correspondence is compositional. For instance the term of type L
$(lam\ \lambda x : L.(app\ x\ x))$ represents the untyped λ-term $\lambda x.(x\ x)$.
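
The representation can be imitated in Python, with Python functions playing the role of the meta-level λ. This is only a sketch under our own assumptions: the class names `Lam`, `App`, `Var` and the printer `show` are hypothetical, `Var` exists solely so that the printer can name bound variables, and Python (unlike the simply-typed framework) does not rule out exotic function bodies.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Var:
    name: str          # introduced only by the printer, not a constructor of L


@dataclass(frozen=True)
class App:
    fun: object        # app : L -> L -> L
    arg: object


@dataclass(frozen=True)
class Lam:
    body: Callable     # lam : (L -> L) -> L, the binder is a Python function


def show(term, depth=0):
    """Print the untyped term represented by a HOAS value."""
    if isinstance(term, Var):
        return term.name
    if isinstance(term, App):
        return f"({show(term.fun, depth)} {show(term.arg, depth)})"
    name = f"x{depth}"
    # Substitution is meta-level application: just call the body.
    return f"λ{name}.{show(term.body(Var(name)), depth + 1)}"


print(show(Lam(lambda x: App(x, x))))
```

No substitution function is written anywhere: instantiating a bound variable is ordinary function application, which is the point of the technique.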

Now, these constructors do not define an inductive type in usual inductive
type theories like the Calculus of Inductive Constructions ([Wer94]) or the
Extended Calculus of Constructions ([Luo94]) because of the leftmost occurrence
of L in the type of constructor lam. If we allowed this kind of inductive
definition, we would be confronted with two serious problems. First, we would lose the
one-to-one correspondence between the objects we represent and the canonical
forms of type $L \to \cdots \to L$. For instance, if we have a Case construct (definition
of a function by case over inductive terms), the term $(lam\ \lambda x : L.\mathrm{Case}\ x\ \mathrm{of} \ldots)$
does not represent any untyped λ-term. Moreover we would lose the important
property of strong normalization; more precisely we could write terms which
would reduce to themselves. Our goal is to introduce a system which repairs
these deficiencies.

Following [DPS97], we will use the modal operator ‘□’ of modal logic IS4
to distinguish the types $A \to B$ of the functional terms well-typed in the
simply-typed λ-calculus from the types $\Box A \to B$ of the functional terms
possibly containing recursive constructs. For instance, in our system, a term such
as $\lambda x : L.\mathrm{Case}\ x\ \mathrm{of} \ldots$ will have type $\Box L \to L$ whereas constructor ‘lam’ will
have type $(L \to L) \to L$. Thus, our typing judgment will rule out undesirable
terms such as $(lam\ \lambda x : L.\mathrm{Case}\ x\ \mathrm{of} \ldots)$.

2.2 Syntax 

The system we present here is roughly the simply-typed λ-calculus extended by
pairs, modality IS4 and recursion. We discuss the addition of polymorphism and
dependent types in the conclusion.

Types To describe the types of the system, we consider a countable collection
of constant types $L_j$ ($j \in \mathbb{N}$), called the ground types. In our approach, they
play the role of inductive types. The types are inductively defined by:

Types : $T := L_j \mid T_1 \to T_2 \mid T_1 \times T_2 \mid \Box T$

A type is said to be pure if it contains no ‘□’ operator and no product.

Context stacks Following the presentation of [PW95], we have context stacks
instead of simple contexts. As usual a context $\Gamma$ is defined as a list of unordered
declarations $x : A$ where all the variables are distinct. A context stack $\Delta$ is
an ordered list of contexts, separated by semicolons $\Gamma_1; \ldots; \Gamma_n$. ‘.’ denotes the
empty context as well as the empty stack.

Notations. A context stack is said to be valid if all the variables of the stack are
distinct. We call local context of a stack $\Delta = \Gamma_1; \ldots; \Gamma_n$ the last context of the
stack: $\Gamma_n$. The notation ‘$\Delta, \Gamma$’, where $\Delta$ is a stack $\Gamma_1; \ldots; \Gamma_n$ and $\Gamma$ is a context,
is the stack $\Gamma_1; \ldots; \Gamma_n, \Gamma$. Similarly, the notation ‘$\Delta, \Delta'$’, where $\Delta$ is the stack
$\Gamma_1; \ldots; \Gamma_n$ and $\Delta'$ is the stack $\Gamma'_1; \ldots; \Gamma'_m$, is the stack $\Gamma_1; \ldots; \Gamma_n, \Gamma'_1; \ldots; \Gamma'_m$.
If $\Delta$ is a valid stack of m contexts $\Gamma_1; \cdots; \Gamma_m$ and n is an integer, $\Delta^n$ denotes
the stack $\Delta$ where the last n contexts have been removed: $\Gamma_1; \cdots; \Gamma_{m-n}$ if $n < m$,
and the empty stack if $n \ge m$.
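
These stack operations are simple enough to sketch in Python. The encoding below is an assumption made for illustration only: a context is a dict from variable names to type names, a stack is a list of such dicts, and the function names `is_valid`, `local_context`, `extend` and `truncate` are hypothetical.

```python
def is_valid(stack):
    """A stack is valid if all variables declared anywhere in it are distinct."""
    names = [x for ctx in stack for x in ctx]
    return len(names) == len(set(names))


def local_context(stack):
    """The last context of the stack (the empty context for the empty stack)."""
    return stack[-1] if stack else {}


def extend(stack, ctx):
    """The stack 'Δ, Γ': the declarations of ctx join the local context."""
    if not stack:
        return [dict(ctx)]
    return stack[:-1] + [{**stack[-1], **ctx}]


def truncate(stack, n):
    """Δ^n: remove the last n contexts; the empty stack if n >= len(stack)."""
    return stack[:max(len(stack) - n, 0)]


stack = [{"x": "A"}, {"y": "B"}, {"z": "C"}]
print(truncate(stack, 1), truncate(stack, 5))
```

The explicit `max(..., 0)` avoids the Python pitfall that `stack[:-0]` is empty, so that removing zero contexts leaves the stack unchanged.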

Terms We view open terms of type L, depending on n variables of type L, as
functional terms of type $L_n = L \to \cdots \to L \to L$, as in [DH94]. For example,
terms in the untyped λ-calculus given in Section 2.1 will have three possible
forms, involving what we called higher-order constructors, written with vectorial
notations:

app : $\lambda \vec{x} : L.(app\ P\ Q)$
lam : $\lambda \vec{x} : L.(lam\ P)$
var : $\lambda \vec{x} : L.x_i$

In general of course, the type of a constructor of a pure type L contains
other types than L. Before describing the set of the terms, we consider a finite
collection of constant terms (the constructors) $C_{j,k}$, given with their pure type:
$B_{j,k,1} \to \cdots \to B_{j,k,n_{j,k}} \to L_j$, where each $B_{j,k,l}$ is a pure type and $L_j$ is a
ground type. If $n_{j,k} = 0$, the type of $C_{j,k}$ is simply $L_j$.

The terms are inductively defined by:

Terms : $M := x \mid C_{j,k} \mid (M\ N) \mid \lambda x : A.M \mid \uparrow M \mid \downarrow M \mid \langle M_1, M_2 \rangle$
$\mid \mathrm{fst}\ M \mid \mathrm{snd}\ M \mid \langle\sigma\rangle\mathrm{Case}\ M\ \mathrm{of}\ (M_{j,k}) \mid \langle\sigma\rangle\mathrm{It}\ M\ \mathrm{of}\ (M'_{j,k})$

where $\sigma$ is a function mapping the ground types $L_j$ ($j \in \mathbb{N}$) to types, $(M_{j,k})$
and $(M'_{j,k})$ are collections of terms indexed by the indexes of the constructors.

We delay till Section 2.7 the explanation of the arguments of operators ‘Case’
and ‘It’. The modal operator ‘$\uparrow$’ introduces an object of type $\Box A$ while the
operator ‘$\downarrow$’ marks the elimination of a term of type $\Box A$. As usual, terms equivalent
under α-conversion are identified.



2.3 Typing Rules for Case and It on a Simple Example 

We give here the typing rules for Case and It for the untyped λ-calculus example
of Section 2.1. Except for the use of the □ operator, and the use of $L_n$ instead
of L, they are pretty standard for the app case. Note how the use of $L_n$ enables
us to extend the usual case (app) to the functional case (lam) in an intuitive
manner:

$\dfrac{\Delta \vdash M : \Box L_n \qquad \Delta \vdash M_{app} : \Box L_n \to \Box L_n \to A \qquad \Delta \vdash M_{lam} : \Box L_{n+1} \to A}{\Delta \vdash \langle\sigma\rangle\mathrm{Case}\ M\ \mathrm{of}\ M_{app}\ M_{lam} : A_n}$

$\dfrac{\Delta \vdash M : \Box L_n \qquad \Delta \vdash M'_{app} : A \to A \to A \qquad \Delta \vdash M'_{lam} : (A \to A) \to A}{\Delta \vdash \langle\sigma\rangle\mathrm{It}\ M\ \mathrm{of}\ M'_{app}\ M'_{lam} : A_n}$

where $A = \sigma(L)$ is the resulting type of the case or iteration process on M.

The Case and It functions take as arguments the resulting values for the n
variables of the term M being analysed; hence the resulting type $A_n$ for both
operators in the above rules.

2.4 Examples 

Let us assume that we have defined the type of the integers ‘Nat’ by declaring
two constructors ‘0 : Nat’ and ‘$S : Nat \to Nat$’. We can informally define the
function which counts the number of bound variables in an untyped λ-term by:

- Count(app M N) = Count(M) + Count(N)

- Count(lam $\lambda x : L.(M\ x)$) = Count(M x), where Count(x) = 1

This function can be implemented in our system using the It construct, where
$\sigma = \{L \mapsto Nat\}$. Count has type $\Box L \to \Box Nat$:

Count := $(\langle\sigma\rangle\mathrm{It}\ \ \lambda m, n : \Box Nat.(plus\ m\ n)\ \ \lambda p : \Box Nat \to \Box Nat.(p\ \uparrow(S\ 0)))$
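
The shape of this definition, one function per constructor with the bound variable replaced by its already computed value, can be mimicked in Python. The sketch below is an untyped approximation under our own assumptions: `iterate` and `count` are hypothetical names, the wrapper `Val` injects a result in place of a bound variable, and the modal operators, which have no Python counterpart, are dropped.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class App:
    fun: object
    arg: object


@dataclass(frozen=True)
class Lam:
    body: Callable


@dataclass(frozen=True)
class Val:
    value: object      # result already chosen for a bound variable


def iterate(term, on_app, on_lam):
    """Replace app by on_app and lam by on_lam throughout the term."""
    if isinstance(term, Val):
        return term.value
    if isinstance(term, App):
        return on_app(iterate(term.fun, on_app, on_lam),
                      iterate(term.arg, on_app, on_lam))
    if isinstance(term, Lam):
        # on_lam receives a function from the variable's value to the result.
        return on_lam(lambda v: iterate(term.body(Val(v)), on_app, on_lam))
    raise TypeError(f"not a term: {term!r}")


def count(term):
    """Count occurrences of bound variables: each variable contributes 1."""
    return iterate(term, lambda m, n: m + n, lambda p: p(1))


print(count(Lam(lambda x: App(x, x))))
```

The two arguments of `iterate` in `count` correspond to the two branches of the It construct above: addition for app, and "apply to one" for lam.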

The function ‘Form’ of type $\Box(L \to L) \to Nat$, which returns 0 if its argument
is a free variable, 1 if it is an abstraction term and 2 if it is an application, can
be defined as follows:

Form M := $(\langle\sigma\rangle\mathrm{Case}\ M\ \mathrm{of}\ \ \lambda u, v : \Box(L \to L).2\ \ \lambda f : \Box(L \to L \to L).1\ \ 0)$
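
A rough Python analogue of ‘Form’ treats an open term with one free variable as a Python function of that variable and inspects only the outermost constructor. As before this is a sketch under assumptions of our own: the names `form`, `Var`, `App` and `Lam` are hypothetical, and the marker `Var` stands in for the value supplied for the free variable.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Var:
    name: str          # marker supplied for the free variable


@dataclass(frozen=True)
class App:
    fun: object
    arg: object


@dataclass(frozen=True)
class Lam:
    body: Callable


def form(open_term):
    """Case analysis on a term with one free variable (a function L -> L).

    Returns 0 for the variable itself, 1 for an abstraction and 2 for an
    application; no recursion into subterms takes place.
    """
    head = open_term(Var("x"))
    if isinstance(head, Var):
        return 0
    if isinstance(head, Lam):
        return 1
    if isinstance(head, App):
        return 2
    raise TypeError(f"not a term: {head!r}")


print(form(lambda x: x), form(lambda x: Lam(lambda y: x)), form(lambda x: App(x, x)))
```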

2.5 Typing Rules 

The typing rules are a combination of the rules for simply-typed λ-calculus, for
pairs and projections, for modal λ-calculus IS4 ([PW95]) and the new rules for
the recursive constructs ‘Case’ and ‘It’. Due to lack of space we do not give the
rules for pairs and projections in this extended abstract. The rules are written
in Figure 1 with the following notations:

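Notations $B_{j,k,1}, \ldots, B_{j,k,n_{j,k}}$ are pure types. $L_j$ is an inductive type.

$(T_i)_{i=1,\ldots,p}$ is a collection, possibly empty, of pure types. Each $T_i$ can be
decomposed as $T_i^1 \to \cdots \to T_i^{r_i} \to L_i$, where $L_i$ is a ground type and each $T_i^j$ is
a pure type.

Given the types $C, D_1, \ldots, D_p$, we denote $D_1 \to \cdots \to D_p \to C$ by $\prod_{i=1}^{i=p} D_i.C$.

We define $T'_i$ by:

$T'_i := \Box(T_1 \to \cdots \to T_p \to T_i^1) \to \cdots \to \Box(T_1 \to \cdots \to T_p \to T_i^{r_i}) \to \sigma(L_i)$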



The map $\sigma$ from ground types to types is extended over pure types by the
equation: $\sigma(A \to B) = \sigma(A) \to \sigma(B)$.

Fig. 1. Typing rules



These typing rules may seem complex at first sight but they are naturally
derived from the behaviour of the Case and It operators with respect to reduction
(Sections 2.7, 2.8).

Although expressed differently, our typing rules are similar to those
in [DPS96] (in which one can find many examples), with a more user-friendly
modal core.

2.6 Basic Properties 

The system allows the same basic stack manipulations as the modal λ-calculus
IS4 without Case and It ([PW95]). In particular, as usual, the typing judgments
are preserved by thinning and strengthening. Later, these properties will still be
true for typed reduction and the interpretations of types.

The substitution rule is still admissible.

The inversion lemmas are not totally trivial because our typing rules are
not syntax-driven. If we try to type a term of type $\Box A$, we can always apply
rule (Pop) as well as the structural rule for M. Nevertheless, they remain fairly
simple (see [Lel97]).

2.7 Reduction Rules for Case and It on a Simple Example 

Now, we turn to the reduction rules of our system. They are inspired by the
reduction rules for Case and It that have been suggested to us by Martin Hofmann
as a means to describe the evaluation mechanism of [DPS97]. These reduction
rules are also the ones underlying the terms and induction principles presented
in [DH94] in the Calculus of Inductive Constructions. Indeed this research was
undertaken with this main idea in mind: our approach to HOAS (i.e. considering
terms in $L_n = L \to \cdots \to L$ instead of terms of type L ([DH94])) should lead
to a much more elegant system than the usual approach. The result seems to
confirm our intuition.

First we show the reduction rules for Case and It in the simple setting of the 
example of Section 2.1. For the sake of simplicity we introduce some notations. 



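Notations. For any type B, the type $B_n$ ($n \in \mathbb{N}$) is defined by $B_0 := B$
and $B_{n+1} := B \to B_n$. We consider a map $\sigma$ from the inductive types to types
such that $\sigma(L) = A$, terms $M_{app}$ of type $\Box L_n \to \Box L_n \to A$, $M_{lam}$ of type
$\Box L_{n+1} \to A$, $M'_{app}$ of type $A \to A \to A$ and $M'_{lam}$ of type $(A \to A) \to A$. We
define two macros ‘case’ and ‘it’ by:

case M := $\langle\sigma\rangle\mathrm{Case}\ M\ \mathrm{of}\ M_{app}\ M_{lam}$
it M := $\langle\sigma\rangle\mathrm{It}\ M\ \mathrm{of}\ M'_{app}\ M'_{lam}$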



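Reduction rules. In our example, the reduction rules for Case and It are the
following ones:

$(\mathrm{case}\ \uparrow\lambda\vec{x} : L.(app\ P\ Q)) \longrightarrow \lambda\vec{u} : A.(M_{app}\ \uparrow\lambda\vec{x} : L.P\ \ \uparrow\lambda\vec{x} : L.Q)$
$(\mathrm{case}\ \uparrow\lambda\vec{x} : L.(lam\ P)) \longrightarrow \lambda\vec{u} : A.(M_{lam}\ \uparrow\lambda\vec{x} : L.P)$
$(\mathrm{case}\ \uparrow\lambda\vec{x} : L.x_i) \longrightarrow \lambda\vec{u} : A.u_i$

$(\mathrm{it}\ \uparrow\lambda\vec{x} : L.(app\ P\ Q)) \longrightarrow \lambda\vec{u} : A.(M'_{app}\ ((\mathrm{it}\ \uparrow\lambda\vec{x} : L.P)\ \vec{u})\ ((\mathrm{it}\ \uparrow\lambda\vec{x} : L.Q)\ \vec{u}))$
$(\mathrm{it}\ \uparrow\lambda\vec{x} : L.(lam\ P)) \longrightarrow \lambda\vec{u} : A.(M'_{lam}\ ((\mathrm{it}\ \uparrow\lambda\vec{x} : L.P)\ \vec{u}))$
$(\mathrm{it}\ \uparrow\lambda\vec{x} : L.x_i) \longrightarrow \lambda\vec{u} : A.u_i$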

The first argument of the Case and It constructs, M, is the inductive term to
analyze (representing an untyped λ-term in our example). The second one, $M_{app}$,
is the function which processes the case of constructor ‘app’. The third
argument, $M_{lam}$, is the function which processes the case of constructor ‘lam’.
Roughly speaking the ‘Case’ construct computes its result by applying $M_{app}$
or $M_{lam}$ to the sons of its main argument. For iteration, the mechanism of
reduction is a bit different: the terms $M'_{app}$ and $M'_{lam}$ are applied to the result
of ‘It’ on the sons of the main argument. Operationally, the effect of ‘It’ on a
term M amounts to replacing the constructors lam and app by the terms $M'_{lam}$
and $M'_{app}$ in M (see [DPS97]).

Now since we want to benefit from higher order declarations, the main
argument of Case/It may have a functional type. In particular we also want to be able
to compute Case/It of a projection $\lambda\vec{x} : L.x_i$ without a leftmost constructor.
That is the reason for the functional type of Case/It constructs: they take as
input the values of the computation for the projections (see [DPS97]).

2.8 Reduction Rules

Now we describe the whole set of reduction rules. Given a term of our calculus,
what we want to obtain at the end of the computation is the term of the object
language it represents. As we have seen earlier (Section 2.1), the canonical forms
(β-normal η-long) are in one-to-one correspondence with the object terms. Thus
we want the computation to return canonical forms. That means our reduction
rules will incorporate η-expansion.

The η-expansion reduction rule has been thoroughly studied
(see [CK93,Aka93,JG95]). Adopting it forces us to restrict the reduction rules
in some way if we still want Strong Normalization. Thus the reduction we will
consider will not be a congruence (more precisely it will not be compatible with
application) and this will induce slight changes in the usual schemes of the
proofs of the Church-Rosser and Strong Normalization properties.

The choice of η-expansion also means we have to keep track of the types of
the terms. Indeed a term can only be η-expanded if it has type $A \to B$. Thus
we will define a notion of typed reduction.
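The side conditions on η-expansion can be made concrete with a short sketch
for the simply-typed fragment only. The data types and the function `eta_step`
below are illustrative assumptions, not the paper's definitions. A step is
refused when the type is not an arrow, when the term is already an abstraction,
or when the term stands in function position of an application; without the
last two restrictions, expansion followed by β-reduction could loop.

```python
from dataclasses import dataclass
from typing import Optional, Union

@dataclass(frozen=True)
class Base:
    name: str

@dataclass(frozen=True)
class Arrow:
    dom: "Type"
    cod: "Type"

Type = Union[Base, Arrow]

@dataclass(frozen=True)
class Var:
    name: str

@dataclass(frozen=True)
class Lam:
    var: str
    ty: Type
    body: "Term"

@dataclass(frozen=True)
class App:
    fun: "Term"
    arg: "Term"

Term = Union[Var, Lam, App]

def free_vars(term):
    """Set of free variable names of a term."""
    if isinstance(term, Var):
        return {term.name}
    if isinstance(term, Lam):
        return free_vars(term.body) - {term.var}
    return free_vars(term.fun) | free_vars(term.arg)

def fresh_name(term):
    """A variable name of the form x0, x1, ... not free in the term."""
    used = free_vars(term)
    index = 0
    while f"x{index}" in used:
        index += 1
    return f"x{index}"

def eta_step(term, ty, applied=False) -> Optional[Term]:
    """One eta-expansion at the root, or None if a side condition fails.

    `applied` tells whether the term occurs as the function part of an
    application, in which case expansion is forbidden.
    """
    if not isinstance(ty, Arrow):
        return None          # only terms of type A -> B can be expanded
    if isinstance(term, Lam):
        return None          # abstractions are not expanded
    if applied:
        return None          # no expansion in function position
    x = fresh_name(term)
    return Lam(x, ty.dom, App(term, Var(x)))

a, b = Base("a"), Base("b")
expanded = eta_step(Var("f"), Arrow(a, b))
print(expanded)
```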

The reduction relation is defined by the inference rules in Figures 2 (simple
types and modality) and 3 (Case and It). We have omitted the product rules
and the compatibility rules other than ($App_l$), which are straightforward.



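$$\frac{\Delta \vdash (\lambda x : A.P\ Q) : B}{\Delta \vdash (\lambda x : A.P\ Q) \to P[Q/x] : B}\ (\beta)$$

$$\frac{\Delta \vdash M : A \to B \qquad M \text{ not an abstraction} \qquad x \text{ fresh}}{\Delta \vdash M \to \lambda x : A.(M\ x) : A \to B}\ (\eta)$$

$$\frac{\Delta \vdash M \to M' : A \to B\ (\text{not an } \eta\text{-step}) \qquad \Delta \vdash N : A}{\Delta \vdash (M\ N) \to (M'\ N) : B}\ (App_l)$$

$$\frac{\Delta \vdash {\downarrow}{\uparrow}M : A}{\Delta \vdash {\downarrow}{\uparrow}M \to M : A}\ (\beta_\Box)
\qquad
\frac{\Delta \vdash {\uparrow}{\downarrow}M : \Box A}{\Delta \vdash {\uparrow}{\downarrow}M \to M : \Box A}\ (\eta_\Box)
\qquad
\frac{\Delta \vdash M \to N : \Box A}{\Delta; \Gamma \vdash M \to N : \Box A}\ (Pop)$$

Fig. 2. Reduction rules. Simple types and modality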



As usual we define the relations $\to^*$ and $=$ (conversion) respectively as the
reflexive, transitive and the reflexive, symmetric, transitive closures of $\to$.

3 Metatheoretical Results

The classic properties of subject reduction, confluence and strong
normalization have already been established for a modal λ-calculus IS4 without
induction ([Lel98a]). Here we extend these results to the recursive operators
Case and It.

3.1 First Results 

First, we state soundness of typed reduction with respect to typing rules. It is
easily proved by induction on the derivation of the first hypothesis.

Theorem 1 (Soundness of reduction).

If $\Delta \vdash M \to M' : A$ then $\Delta \vdash M : A$ and $\Delta \vdash M' : A$.

The relationship between substitution and typed reduction is not as easy
as in the simply-typed λ-calculus. If $P \to^* P'$ and $Q \to^* Q'$ then we no
longer have $P[Q/x] \to^* P'[Q'/x]$ because of the side-conditions of reduction
rules ($\eta$) and ($App_l$). Thus we only prove weak forms of the usual results. For
instance, if $\Delta, x : A \vdash P : B$ and $\Delta \vdash Q \to Q' : A$, we only state that there
is a term R such that $\Delta \vdash P[Q/x] \to^* R : B$ and $\Delta \vdash P[Q'/x] \to^* R : B$.
Nevertheless, these results enable us to prove the local confluence property:

Lemma 1 (Local Confluence).

If $\Delta \vdash M \to N : A$ and $\Delta \vdash M \to P : A$ then there is a term Q such that
$\Delta \vdash N \to^* Q : A$ and $\Delta \vdash P \to^* Q : A$.

3.2 Strong Normalization 

Now we briefly sketch our proof of the Strong Normalization theorem for our
system. The proof follows the idea of normalization proofs ‘à la Tait’ and is
inspired by [Wer94] (for the inductive part) and [Gha96] (for the η-expansion
part).

Reducibility Candidates

First we give a definition of the reducibility candidates ([GLT89]) adapted
to our setting. Let us call $\Lambda$ the set of our terms, defined in Section 2.2.

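Definition 1 (Reducibility Candidates).

Given a type A, the reducibility candidates $CR_A$ are sets $\mathcal{C}$ of pairs $(\Delta, M)$ of
a context stack and a term. They are defined as follows:

CR1 $\forall (\Delta, M) \in \mathcal{C}$, M is strongly normalizing in $\Delta$ (i.e. there is no infinite
sequence of reductions starting from M in $\Delta$).

CR1’ $\mathcal{C} \subseteq \{(\Delta, M) \mid \Delta \vdash M : A\}$

CR2 $\forall (\Delta, M) \in \mathcal{C}$ such that $\Delta \vdash M \to M' : A$, we have $(\Delta, M') \in \mathcal{C}$.

CR3 If $M \in NT$, $\Delta \vdash M : A$ and $\forall M'$ such that $\Delta \vdash M \to M' : A$ (not an
η-expansion), $(\Delta, M') \in \mathcal{C}$, then we have $(\Delta, M) \in \mathcal{C}$.

CR4 If $A = B \to C$ and $(\Delta, M) \in \mathcal{C}$ then $(\Delta, \lambda z : B.(M\ z)) \in \mathcal{C}$, where z is a
fresh variable.

where $NT = \Lambda \setminus (\{\lambda x : A.M \mid M \in \Lambda\} \cup \{{\uparrow}M \mid M \in \Lambda\} \cup \{\langle M, N\rangle \mid M, N \in \Lambda\})$.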

Note that instead of taking sets of terms, we consider sets of pairs of a
stack and a term. Indeed, since our reduction is typed because of η-expansion,
it is convenient for the reducibility candidates to contain well-typed terms. In
rule CR3, the restriction “$\Delta \vdash M \to M' : A$ is not an η-expansion” comes
from [JG95]. It has been introduced to cope with η-expansions. The rule CR4 is
also needed because of the η-expansions ([Gha96]).

As usual, if $\mathcal{C}$ and $\mathcal{D}$ belong to $CR_A$ then $\mathcal{C} \cap \mathcal{D}$ belongs to $CR_A$. Thus $CR_A$
is an inf-semilattice. Next, we define the sets $\mathcal{C} \to \mathcal{D}$, $\mathcal{C} \times \mathcal{D}$, $\Box\mathcal{C}$ where $\mathcal{C}$ and $\mathcal{D}$
are two reducibility candidates:

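Definition 2 ($\mathcal{C} \to \mathcal{D}$, $\Box\mathcal{C}$, $\mathcal{C} \times \mathcal{D}$).

— $\mathcal{C} \to \mathcal{D} := \{(\Delta, M) \mid \Delta \vdash M : A \to B$ and $\forall \Gamma, \forall ((\Delta, \Gamma), N) \in \mathcal{C}$,
$((\Delta, \Gamma), (M\ N)) \in \mathcal{D}\}$

— $\Box\mathcal{C} := \{(\Delta, M) \mid \Delta \vdash M : \Box A$ and $\forall \Delta'$ stack s.t. $(\Delta; \Delta')$ is valid,
$((\Delta; \Delta'), {\downarrow}M) \in \mathcal{C}\}$.

— $\mathcal{C} \times \mathcal{D} := \{(\Delta, M) \mid \Delta \vdash M : A \times B$ and $\forall \Gamma$ context s.t. $(\Delta, \Gamma)$ is valid,
$((\Delta, \Gamma), \mathrm{fst}\ M) \in \mathcal{C}$ and $((\Delta, \Gamma), \mathrm{snd}\ M) \in \mathcal{D}\}$.

In the definition of $\Box\mathcal{C}$, we need to extend the stack of contexts $\Delta$ with $\Delta'$
in order to get $((\Delta; \Delta'), M) \in \Box\mathcal{C}$ whenever $(\Delta, M) \in \Box\mathcal{C}$ (similarly to the case
of $\mathcal{C} \to \mathcal{D}$).

In the definition of $\mathcal{C} \to \mathcal{D}$, the context $\Gamma$ added to the stack is essential: in
the intermediate lemmas, it allows us to add fresh variables to the context.

Proposition 1. If $\mathcal{C}$ and $\mathcal{D}$ are C.R., then $\mathcal{C} \to \mathcal{D}$, $\mathcal{C} \times \mathcal{D}$ and $\Box\mathcal{C}$ are C.R.

Interpretation of types and contexts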

Following the sketch of normalization proofs ‘à la Tait’, we define the
interpretations of types.

Definition 3 (Interpretations of types).

— $[\![L_j]\!] := \{(\Delta, M) \mid \Delta \vdash M : L_j$ and M is SN in $\Delta\}$,

— $[\![A \to B]\!] := [\![A]\!] \to [\![B]\!]$,

— $[\![A \times B]\!] := [\![A]\!] \times [\![B]\!]$,

— If A is not pure, $[\![\Box A]\!] := \Box[\![A]\!]$

All the above interpretations are obviously C.R.s, except, maybe, for the first
case:

Proposition 2 ($[\![L_j]\!]$ is a C.R.).

The set $[\![L_j]\!]$ is a reducibility candidate.

In order to define $[\![\Box A]\!]$ in the case where A is pure, we have to take into
account the fact that $\Box A$ may be the type of the inductive argument of Case/It.
The definition of $[\![\Box A]\!]$ in this case involves the smallest fixpoint of a
function we do not give here, because of space limitations (see [Lel97]).

At this point, we have defined the interpretation $[\![A]\!]$ for all the
types A. The following theorem stems from the definitions of the interpretations
of types.

Theorem 2 ($[\![A]\!]$ is a C.R.).

Given any type A, the set $[\![A]\!]$ is a C.R.

Then we define the notion of interpretation of a context stack. As in the
classic case of the simply-typed λ-calculus, the interpretation $[\![\Delta]\!]_\Psi$ of stack $\Delta$
in stack $\Psi$ is a set of substitutions from $\Delta$ to $\Psi$, but the definition is a bit more
complex here because we have to deal with context stacks instead of simple
contexts. Thus we use a non-standard notion of substitution.

Definition 4 (Pre-substitution).

A pre-substitution $\rho$ from a stack $\Delta$ to a stack $\Psi$ is a mapping from the set of
the variables declared in $\Delta$ into the set of the terms with all their free variables
in $\Psi$.

A pre-substitution $\rho$ can be applied to a term M with all its free variables
in $\Delta$. The result of this operation, denoted by $\rho(M)$, is equal to the term M where
all its free variables x have been replaced by their images $\rho(x)$ under $\rho$.

Notations. Given two stacks $\Delta$ and $\Psi$, a pre-substitution $\rho$ from $\Delta$ to $\Psi$,
a variable x not declared in $\Delta$ and M a term with all its free variables in $\Psi$,
we denote by $\rho[x \mapsto M]$ the pre-substitution from $\Delta, x : A$ to $\Psi$ such that
$\rho[x \mapsto M](y) = \rho(y)$ if y is declared in $\Delta$ and $\rho[x \mapsto M](x) = M$.

Given a stack $\Delta'$ such that $\Delta; \Delta'$ is valid and a pre-substitution $\rho'$ from $\Delta'$ to $\Psi$,
$\rho; \rho'$ denotes the pre-substitution from $\Delta; \Delta'$ to $\Psi$ such that $(\rho; \rho')(x) = \rho(x)$
if x is declared in $\Delta$ and $(\rho; \rho')(x) = \rho'(x)$ if x is declared in $\Delta'$.
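As an informal illustration of these notations, a pre-substitution can be
pictured as a finite map from variable names to terms. The sketch below is an
assumption-laden simplification: it ignores types and the stack structure,
represents terms as tagged tuples, and the helper names `apply_presub`,
`extend` and `concat` are hypothetical. It also assumes that bound names never
clash with the free variables of the images, so no renaming is performed.

```python
# Terms (assumed encoding): ("var", x) | ("app", M, N) | ("lam", x, M)

def apply_presub(rho, term):
    """rho(M): replace every free variable x of M by its image rho[x]."""
    kind = term[0]
    if kind == "var":
        return rho.get(term[1], term)
    if kind == "app":
        return ("app", apply_presub(rho, term[1]), apply_presub(rho, term[2]))
    if kind == "lam":
        # The bound variable shadows any binding of the same name in rho.
        inner = {y: image for y, image in rho.items() if y != term[1]}
        return ("lam", term[1], apply_presub(inner, term[2]))
    raise ValueError(f"unknown term: {term!r}")

def extend(rho, x, image):
    """rho[x -> M]: x must not already be declared in the domain of rho."""
    if x in rho:
        raise ValueError(f"{x} is already declared")
    extended = dict(rho)
    extended[x] = image
    return extended

def concat(rho, rho_prime):
    """rho; rho': union of two pre-substitutions with disjoint domains."""
    if set(rho) & set(rho_prime):
        raise ValueError("domains must be disjoint")
    return {**rho, **rho_prime}

rho = extend({}, "x", ("var", "a"))
rho_prime = {"y": ("app", ("var", "b"), ("var", "c"))}
combined = concat(rho, rho_prime)
image = apply_presub(combined, ("app", ("var", "x"), ("var", "y")))
print(image)
```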
Definition 5 (Interpretation of context stack).

Given two stacks $\Delta$ and $\Psi$, the interpretation of $\Delta$ in $\Psi$, $[\![\Delta]\!]_\Psi$, is a set of
pre-substitutions from $\Delta$ to $\Psi$. It is defined by induction on $\Delta$:

— $[\![\,.\,]\!]_\Psi$ is the singleton whose only element is the empty pre-substitution from .
to $\Psi$.

— $[\![\Gamma, x : A]\!]_\Psi$ is the set of the pre-substitutions $\rho[x \mapsto M]$, where $\rho$ belongs to
$[\![\Gamma]\!]_\Psi$ and $(\Psi, M)$ is in $[\![A]\!]$.

— $[\![\Delta; \Gamma]\!]_\Psi$ is the set of pre-substitutions $\rho; \rho'$ such that $\rho$ belongs to $[\![\Delta]\!]_{\Psi^n}$
($n \in \mathbb{N}$) and $\rho'$ belongs to $[\![\Gamma]\!]_\Psi$

where the notation $\Psi^n$ has been previously defined in Section 2.2.

In the definition of $[\![\Delta; \Gamma]\!]_\Psi$, the requirement that $\rho$ belongs to $[\![\Delta]\!]_{\Psi^n}$ for some n
is deliberately flexible; it enables us to cope with the context stacks in the
proofs. For example, we will have that $\rho$ belongs to $[\![\Delta; .]\!]_\Psi$
whenever $\rho$ belongs to $[\![\Delta]\!]_\Psi$.

Soundness of Typing. The following lemma is proved by induction on the
derivation of $\Delta \vdash M : A$. The most difficult case occurs for rule (↑). It is solved
by using the typing restrictions imposed by modality (see [Lel97]).

Lemma 2 (Soundness of Typing).

If $\Delta \vdash M : A$ and $\rho \in [\![\Delta]\!]_\Psi$ then $(\Psi, \rho(M)) \in [\![A]\!]$.

The strong normalization theorem is then an easy corollary, using the fact
that for any stack $\Delta$, the identity pre-substitution from $\Delta$ to $\Delta$ belongs to $[\![\Delta]\!]_\Delta$.

Theorem 3 (Strong Normalization).

There is no infinite sequence of reductions.

3.3 Confluence and Conservative Extension

The confluence property is a corollary of strong normalization (Theorem 3)
and the local confluence result (this fact is often called “Newman’s Lemma”,
after [New42]).

Theorem 4 (Confluence). If $\Delta \vdash M \to^* N : A$ and $\Delta \vdash M \to^* P : A$ then
there is a term Q such that $\Delta \vdash N \to^* Q : A$ and $\Delta \vdash P \to^* Q : A$.

As usual, the ‘uniqueness of normal forms’ property is a corollary of the
strong normalization and confluence theorems.

Corollary 1 (Uniqueness of normal forms).

If $\Delta \vdash M : A$ then M reduces to a unique canonical form in $\Delta$.

The conservative extension property uses the strong normalization result
together with a technical lemma that defines the possible forms of a canonical
term [Lel97].



A Modal Lambda Calculus with Iteration and Case Constructs 






Theorem 5 (Conservative extension).

Our system is a conservative extension of the simply-typed λ-calculus, i.e. if
$\Delta \vdash M : A$ with $\Delta$ a pure context stack and A a pure type then M has a unique
canonical form N which is pure.



4 Related Works 

Our system has been inspired by [DPS97]. The main difference is that the
underlying modal λ-calculus is easier to use and seems to be better adapted to
a future extension to dependent types. Splitting the context in two parts (the 
intuitionistic and the modal parts) would most probably make the treatment 
of dependent types even more difficult: how should we represent a modal type 
which depends on both non-modal and modal types? 

We also provide reduction rules, instead of a particular strategy for evalu- 
ation. Finally, due to that latter point and the fact that we have adapted well 
known proof methods, our metatheoretic proofs are much more compact and 
easier to read. 

Raymond McDowell and Dale Miller have proposed [MM97] a meta-logic 
to reason about object logics coded using higher order abstract syntax. Their 
approach is quite different from ours, less ambitious in a sense. They do not give 
a typing system, supporting the judgments-as-types principle, but two logics: 
one for each level (object and meta). Moreover they only have induction on 
natural numbers, which can be used to derive other induction principles via the 
construction of an appropriate measure. 

Frank Pfenning and Carsten Schürmann have also defined a meta-logic ‘$M_2$’,
which allows inductive reasoning over HOAS encodings in LF ([PS98]). It was
designed to support automated theorem proving. This meta-logic has been
implemented in the theorem prover Twelf, which gives a logic programming
interpretation of $M_2$. Twelf has been used to automatically prove properties such
as type preservation for Mini-ML.

From our definition of valid terms in an object language $L_n = L \to \cdots \to
L \to L$ implemented in the Calculus of Inductive Constructions, we derived
an induction principle that we claimed to be more natural, and more
powerful, than the usual ones ([DH94]). Martin Hofmann recently formalized this
induction principle in a modal meta-logic, using categorical tools [Hof99]. In this
paper, he very nicely formalizes and compares, on the categorical level, several
representations of terms using HOAS, and several induction principles currently
used, sometimes without justification, for functional terms.

5 Conclusion and Future Work 

We have presented a modal λ-calculus IS4 with primitive recursive constructs
that we claim to be better than the previous proposal [DPS97]. The
conservative extension theorem, which guarantees that the adequacy of encodings
is preserved, is proved as well as the Church-Rosser and strong normalization
properties.

Our main goal is now to extend this system to dependent types and to poly- 
morphic types. This kind of extension is not straightforward but we expect our 
system to be flexible enough to allow it. We have already proposed an extension 
of our system to dependent types, only with a “non-dependent” rule for
elimination for the moment [DL99,Lel98b]. A full treatment of dependent types would
have given an induction principle that we did not succeed in justifying in our 
setting. The work by Martin Hofmann [Hof99] suggests that we should be able 
to go further in this direction. 

Another interesting direction of research consists in replacing our recur- 
sive operators by operators for pattern-matching such as those used in the 
ALF [MN94] system, implementing Martin-Löf’s Type Theory [NPS90]. Some
hints for a concrete syntax for that extension have been given in [DPS97]. F.
Pfenning and C. Schürmann are currently working on the definition of a
meta-logic along these lines.

Abstract. We consider a class of logical formalisms, in which first-order
logic is extended by identifying propositions modulo a given congruence. 
We particularly focus on the case where this congruence is induced by 
a confluent and terminating rewrite system over the propositions. We 
show that this extension enhances the power of first-order logic and that 
various formalisms, including Church’s higher-order logic (HOL) can be 
described in our framework. 

We conjecture that proof normalization and logical consistency always 
hold over this class of formalisms, provided some minimal conditions 
over the rewrite system are fulfilled. We prove this conjecture for some 
subcases, including HOL. 



1 Introduction 

1.1 Motivations 

A proof-system implements a given logical formalism. The choice of this for- 
malism is important, since in the field of actually mechanically checked formal 
proofs, logical formalisms are required not only to be expressive (logical com- 
plexity), but also practicable. More precisely, some important issues are: 

— The conciseness of proofs: in recent practical developments, it has become
clear that the size of the proof-object, and thus its handling and the
practicability of proof-checking, can become critical; the formalism in which
the proof is expressed is an important factor in that respect.

— A side-effect of the latter is also that smaller proofs often reflect more closely 
the mathematical intuition. In other words, this allows the user to better 
grasp the mathematical object he/she produces. 

— Last but not least, automatic proof-search and more generally computer- 
provided user-help depend upon the chosen formalism. It is well-known that 
proof synthesis algorithms are expressed more or less clearly in different 
logics. 

In this respect, particular attention has often been given to the
distinction between calculation and reasoning steps. Schematically, the former can be
unambiguously and mechanically performed and reproduced, whereas the latter
correspond to the application of a logical inference rule, whose choice is the
responsibility of the author/user. As a consequence, the calculation steps can be
omitted in the proof objects. A typical instance is the conversion rule of type
theories; a typical application is recent work using computational reflection like,
for instance, [1].

1.2 Systems Modulo 

Theorem proving modulo is a way to remove computational arguments from
proofs by reasoning modulo a congruence on propositions. This idea is certainly
not new. For instance, in a language containing an associative binary function
symbol +, Plotkin [11] proposes to identify propositions such as $P((a + b) + c)$
and $P(a + (b + c))$ that differ only by a rearrangement of brackets. Similarly,
the conversion rule of type theories ([9,2,10], among others) identifies
propositions w.r.t. generalized β-reduction: the propositions 1 + 1 = 2 and
2 = 2 are logically identical.

In [3], we have proposed to use this idea in the definition of first-order logic
itself. In the simplest cases, we can first define a congruence on terms (e.g.
identifying the term $(a + b) + c$ with the term $a + (b + c)$) and then extend this
congruence to propositions. However, in some cases, we want to define the
congruence directly on propositions. A striking point is that adding well-chosen
congruences enhances the logical expressivity of the formalism; typically, it leads to
a first-order and axiom-free presentation of Church’s higher-order logic (HOL).
An interesting application is that enforcing the distinction between calculation
and reasoning leads to a very nice clarification of higher-order resolution. See [3]
for details.

1.3 About this Work 

In this paper, we study theorem proving modulo from the proof-theoretic view- 
point and more particularly the properties of cut-elimination and consistency. 
Proof normalization for such proof systems does not always hold and we present 
several counter-examples below; but we conjecture that proofs always normal- 
ize for congruences that can be defined by a confluent and terminating rewrite 
system, which rewrites terms to terms and atomic propositions to arbitrary ones. 

In this paper we show some particular cases of this conjecture: we show 
that proof normalization holds for our presentation of higher-order logic, for all 
congruences defined by a confluent and terminating rewriting system rewriting 
terms to terms and atomic propositions to quantifier free propositions and for 
positive rewrite systems i.e. ones in which the right hand side of each rewrite 
rule contains only positive occurrences of atomic propositions. 

2 Deduction Modulo 

As already mentioned, the systems we consider here are all built on top of
the logical rules of first-order logic, the actual expressive power being determined
solely by the choice of the congruence. In the present version, we only consider
the natural deduction presentation and restrict ourselves to intuitionistic logic;
our results can be extended to classical sequent calculus, but this requires some
more attention and space. We refer to [4] for extensive details as well as for
detailed proofs.



2.1 Natural Deduction Modulo 

We place ourselves in many-sorted first-order logic. The definitions below are 
well-known and thus not too detailed. 

We consider a countable set of sorts, whose elements will be denoted by
$s, s', s_1, \ldots$ The set of object variables is countable, and every variable has an
associated sort; we write $x^s, y^s, \ldots$ We give ourselves a set of function symbols
and of predicate symbols. Each of these comes with its rank. The formation rules
for objects and propositions are the usual ones:

— If $f$ is a function symbol of rank $(s_1, \ldots, s_n, s')$ and $t_1, \ldots, t_n$ are respectively
objects of sort $s_1, \ldots, s_n$, then $f(t_1, \ldots, t_n)$ is a well-formed object of sort $s'$.

— If $P$ is a predicate symbol of rank $(s_1, \ldots, s_n)$ and $t_1, \ldots, t_n$ are
respectively objects of sort $s_1, \ldots, s_n$, then $P(t_1, \ldots, t_n)$ is a well-formed atomic
proposition.

Well-formed propositions are built up from atomic propositions, from the usual
connectors $\Rightarrow, \vee, \wedge, \bot$, and the quantifiers $\forall$ and $\exists$. Note that, implicitly,
quantification in $\forall x^s P$ or $\exists x^s P$ is restricted to the sort $s$.

In what follows, we will often omit the sort of variables, simply writing x, y,
etc.

In order for proof-checking to be decidable, we assume that various relations 
are decidable (equality over variables, the sort of variables, the rank of symbols, 
etc). 

Finally, let $\equiv$ be a decidable congruence on propositions.

Figure 1 gives the rules of natural deduction modulo this congruence. As
mentioned above, proof checking is decidable, since we provided the necessary
assumptions and the needed information in the quantifier rules.



2.2 Equivalence 

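Proposition 1. (Equivalence) For every congruence $\equiv$, there exists a theory $\mathcal{T}$
such that

$$\mathcal{T}, \Gamma \vdash P \quad \text{if and only if} \quad \Gamma \vdash_\equiv P$$

Proof. Take the theory $\mathcal{T}$ containing all the propositions $P \Leftrightarrow Q$ where $P \equiv Q$.

$$\begin{array}{ll}
\dfrac{}{\Gamma \vdash_\equiv A'} & \text{(axiom) if } A \in \Gamma \text{ and } A \equiv A' \\[10pt]
\dfrac{\Gamma, A \vdash_\equiv B}{\Gamma \vdash_\equiv C} & (\Rightarrow\text{-intro}) \text{ if } C \equiv (A \Rightarrow B) \\[10pt]
\dfrac{\Gamma \vdash_\equiv C \quad \Gamma \vdash_\equiv A}{\Gamma \vdash_\equiv B} & (\Rightarrow\text{-elim}) \text{ if } C \equiv (A \Rightarrow B) \\[10pt]
\dfrac{\Gamma \vdash_\equiv A \quad \Gamma \vdash_\equiv B}{\Gamma \vdash_\equiv C} & (\wedge\text{-intro}) \text{ if } C \equiv (A \wedge B) \\[10pt]
\dfrac{\Gamma \vdash_\equiv C}{\Gamma \vdash_\equiv A} & (\wedge\text{-elim1}) \text{ if } C \equiv (A \wedge B) \\[10pt]
\dfrac{\Gamma \vdash_\equiv C}{\Gamma \vdash_\equiv B} & (\wedge\text{-elim2}) \text{ if } C \equiv (A \wedge B) \\[10pt]
\dfrac{\Gamma \vdash_\equiv A}{\Gamma \vdash_\equiv C} & (\vee\text{-intro1}) \text{ if } C \equiv (A \vee B) \\[10pt]
\dfrac{\Gamma \vdash_\equiv B}{\Gamma \vdash_\equiv C} & (\vee\text{-intro2}) \text{ if } C \equiv (A \vee B) \\[10pt]
\dfrac{\Gamma \vdash_\equiv D \quad \Gamma, A \vdash_\equiv C \quad \Gamma, B \vdash_\equiv C}{\Gamma \vdash_\equiv C} & (\vee\text{-elim}) \text{ if } D \equiv (A \vee B) \\[10pt]
\dfrac{\Gamma \vdash_\equiv B}{\Gamma \vdash_\equiv A} & (\bot\text{-elim}) \text{ if } B \equiv \bot \\[10pt]
\dfrac{\Gamma \vdash_\equiv A}{\Gamma \vdash_\equiv B} & (\forall\text{-intro}) \text{ if } B \equiv (\forall x\ A) \text{ and } x \text{ not free in } \Gamma \\[10pt]
\dfrac{\Gamma \vdash_\equiv B}{\Gamma \vdash_\equiv C} & (\forall\text{-elim}) \text{ if } B \equiv (\forall x\ A) \text{ and } C \equiv ([t/x]A) \\[10pt]
\dfrac{\Gamma \vdash_\equiv C}{\Gamma \vdash_\equiv B} & (\exists\text{-intro}) \text{ if } B \equiv (\exists x\ A) \text{ and } C \equiv ([t/x]A) \\[10pt]
\dfrac{\Gamma \vdash_\equiv C \quad \Gamma, A \vdash_\equiv B}{\Gamma \vdash_\equiv B} & (\exists\text{-elim}) \text{ if } C \equiv (\exists x\ A) \text{ and } x \text{ is not free in } \Gamma, B
\end{array}$$

Fig. 1. Natural deduction modulo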



2.3 Rewriting 

The framework we have defined up to here is extremely general. In the following,
and to study proof-theoretic properties, we mainly deal with the case where the
congruence $\equiv$ is generated by a rewriting relation. The definition is
straightforward.

Definition 1. We say that a congruence $\equiv$ is defined by a confluent and
terminating rewriting system $\mathcal{R}$ rewriting terms to terms and atomic propositions to
arbitrary ones when $P \equiv Q$ if and only if P and Q have the same normal form
for the system $\mathcal{R}$. In this case, the congruence $\equiv$ is decidable.

Remark 1. The definition above can be slightly generalized by allowing
non-oriented equations relating terms to terms and atomic propositions to atomic
propositions (for instance commutativity). To this end we consider a class rewrite
system $\mathcal{RE}$ formed with a rewrite system $\mathcal{R}$ rewriting atomic propositions to
propositions and a set $\mathcal{E}$ of equations equating atomic propositions with atomic
propositions and terms with terms and defining a congruence written $=_\mathcal{E}$.

Given a system $\mathcal{RE}$, the term $t$ $\mathcal{RE}$-rewrites to $t'$ if $t =_\mathcal{E} u[\sigma(l)]_\omega$ and
$t' =_\mathcal{E} u[\sigma(r)]_\omega$, for some rule $l \to r \in \mathcal{R}$, some term $u$, some occurrence $\omega$
in $u$ and some substitution $\sigma$.



2.4 Examples 

For reasons of space, we only provide two examples here.

Example 1. (Simplification) In an integral ring, we can use the usual
simplification rules over objects like $a \times 0 \to 0$, $a \times (b + c) \to a \times b + a \times c$, etc. But we
can also add the following rule for simplifying equalities:

$$a \times b = 0 \to a = 0 \vee b = 0$$

or the rule

$$a \times b = a \times c \to a = 0 \vee b = c$$

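Example 2. (Higher-order logic) As mentioned above, deduction modulo allows
us to capture formalisms which go beyond the usual field of first-order logic; here
is a faithful encoding of Church’s higher-order logic.

The sorts are simple types inductively defined by

— $\iota$ and $o$ are simple types,

— if $T$ and $U$ are simple types then $T \to U$ is a simple type.

The language $\mathcal{L}$ is composed of the individual symbols

— $S_{T,U,V}$ of sort $(T \to U \to V) \to (T \to U) \to T \to V$,

— $K_{T,U}$ of sort $T \to U \to T$,

— $\dot\Rightarrow$, $\dot\wedge$, $\dot\vee$ of sort $o \to o \to o$, $\dot\bot$ of sort $o$,

— $\dot\forall_T$ and $\dot\exists_T$ of sort $(T \to o) \to o$,

the function symbols

— $\alpha_{T,U}$ of rank $(T \to U, T, U)$,

and the predicate symbol

— $\varepsilon$ of rank $(o)$.

As can be guessed, $S_{T,U,V}$ and $K_{T,U}$ are typed combinators and are used to
represent the functions which are the objects of HOL. The objects and functions
$\dot\Rightarrow, \dot\wedge, \dot\vee, \dot\bot, \dot\forall_T$ and $\dot\exists_T$ allow us to represent propositions as objects of sort $o$.
Finally, the predicate $\varepsilon$ allows us to transform such an object $t : o$ into the actual
corresponding proposition $\varepsilon(t)$. This typical reflection operation appears clearly
in the rewrite rules:

$$\begin{aligned}
\alpha(\alpha(\alpha(S, x), y), z) &\to \alpha(\alpha(x, z), \alpha(y, z)) \\
\alpha(\alpha(K, x), y) &\to x \\
\varepsilon(\alpha(\alpha(\dot\Rightarrow, x), y)) &\to \varepsilon(x) \Rightarrow \varepsilon(y) \\
\varepsilon(\alpha(\alpha(\dot\wedge, x), y)) &\to \varepsilon(x) \wedge \varepsilon(y) \\
\varepsilon(\alpha(\alpha(\dot\vee, x), y)) &\to \varepsilon(x) \vee \varepsilon(y) \\
\varepsilon(\dot\bot) &\to \bot \\
\varepsilon(\alpha(\dot\forall, x)) &\to \forall y\ \varepsilon(\alpha(x, y)) \\
\varepsilon(\alpha(\dot\exists, x)) &\to \exists y\ \varepsilon(\alpha(x, y))
\end{aligned}$$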
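To see how these rules compute, the following sketch implements only three of
them (the S rule, the K rule and the rule for the implication symbol) on an
untyped tuple encoding. The encoding, the tag names `"a"`, `"eps"`, `"imp"` and
`"=>"`, and the functions `step` and `normalize` are assumptions made for the
illustration; type indices are dropped, so termination is only guaranteed on
inputs that correspond to well-typed objects.

```python
def a(f, x):
    """The application symbol α(f, x)."""
    return ("a", f, x)

def step(t):
    """Apply one of the three implemented rules at the root, or return None."""
    if isinstance(t, tuple) and t[0] == "a":
        f, z = t[1], t[2]
        # α(α(K, x), y) → x
        if isinstance(f, tuple) and f[0] == "a" and f[1] == "K":
            return f[2]
        # α(α(α(S, x), y), z) → α(α(x, z), α(y, z))
        if (isinstance(f, tuple) and f[0] == "a"
                and isinstance(f[1], tuple) and f[1][0] == "a"
                and f[1][1] == "S"):
            x, y = f[1][2], f[2]
            return a(a(x, z), a(y, z))
    if isinstance(t, tuple) and t[0] == "eps":
        u = t[1]
        # ε(α(α(imp, x), y)) → ε(x) ⇒ ε(y)
        if (isinstance(u, tuple) and u[0] == "a"
                and isinstance(u[1], tuple) and u[1][0] == "a"
                and u[1][1] == "imp"):
            return ("=>", ("eps", u[1][2]), ("eps", u[2]))
    return None

def normalize(t):
    """Rewrite at the root when possible, otherwise inside the arguments."""
    while True:
        reduct = step(t)
        if reduct is not None:
            t = reduct
            continue
        if isinstance(t, tuple):
            rebuilt = (t[0],) + tuple(normalize(s) for s in t[1:])
            if rebuilt != t:
                t = rebuilt
                continue
        return t

# S K K behaves as the identity combinator.
skk_p = a(a(a("S", "K"), "K"), "p")
proposition = ("eps", a(a("imp", skk_p), "q"))
print(normalize(skk_p), normalize(proposition))
```

The object of sort o built with the dotted implication is turned into a genuine
implication between the propositions ε(p) and ε(q).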



3 Reduction and Cut-Elimination 

We now turn to the study of cut-elimination. Since we here place ourselves
in a natural deduction framework, this result boils down to the normalization
property with respect to β-reduction. It is possible to define what is a normal
(or cut-free) proof directly on the natural deduction derivations. For reasons of
space, we omit this here, and go directly to defining the typed λ-terms underlying
proofs.

3.1 Proof-Terms

Following Heyting semantics and the Curry-Howard isomorphism we write proofs
as λ-terms typed by propositions of first-order logic. These terms can contain
both variables of the first-order language (written $x, y, z, \ldots$) and proof variables
(written $\alpha, \beta, \ldots$). Terms of the first-order language are written $t, u, \ldots$ while
proof-terms are written $\pi, \rho, \ldots$

Definition 2 (Proofs).

$$\begin{array}{rcl}
\pi & ::= & \alpha \\
 & \mid & \lambda\alpha\ \pi \mid (\pi\ \pi') \\
 & \mid & \langle\pi, \pi'\rangle \mid \mathit{fst}(\pi) \mid \mathit{snd}(\pi) \\
 & \mid & i(\pi) \mid j(\pi) \mid (\delta\ \pi_1\ \alpha\pi_2\ \beta\pi_3) \\
 & \mid & (\mathit{botelim}\ \pi) \\
 & \mid & \lambda x\ \pi \mid (\pi\ t) \\
 & \mid & \langle t, \pi\rangle \mid (\mathit{exelim}\ \pi\ x\alpha\pi')
\end{array}$$

As is now usual, λ-abstraction models the ∀-intro and ⇒-intro rules,
application the corresponding elimination rules, the pair construct models the
∧-introduction, etc.






Gilles Dowek and Benjamin Werner 



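$$\begin{array}{ll}
\dfrac{}{\Gamma \vdash_\equiv \alpha : A'} & \text{(axiom) if } \alpha : A \in \Gamma \text{ and } A \equiv A' \\[10pt]
\dfrac{\Gamma, \alpha : A \vdash_\equiv \pi : B}{\Gamma \vdash_\equiv \lambda\alpha\ \pi : C} & (\Rightarrow\text{-intro}) \text{ if } C \equiv (A \Rightarrow B) \\[10pt]
\dfrac{\Gamma \vdash_\equiv \pi : C \quad \Gamma \vdash_\equiv \pi' : A}{\Gamma \vdash_\equiv (\pi\ \pi') : B} & (\Rightarrow\text{-elim}) \text{ if } C \equiv (A \Rightarrow B) \\[10pt]
\dfrac{\Gamma \vdash_\equiv \pi : A \quad \Gamma \vdash_\equiv \pi' : B}{\Gamma \vdash_\equiv \langle\pi, \pi'\rangle : C} & (\wedge\text{-intro}) \text{ if } C \equiv (A \wedge B) \\[10pt]
\dfrac{\Gamma \vdash_\equiv \pi : C}{\Gamma \vdash_\equiv \mathit{fst}(\pi) : A} & (\wedge\text{-elim}) \text{ if } C \equiv (A \wedge B) \\[10pt]
\dfrac{\Gamma \vdash_\equiv \pi : C}{\Gamma \vdash_\equiv \mathit{snd}(\pi) : B} & (\wedge\text{-elim}) \text{ if } C \equiv (A \wedge B) \\[10pt]
\dfrac{\Gamma \vdash_\equiv \pi : A}{\Gamma \vdash_\equiv i(\pi) : C} & (\vee\text{-intro}) \text{ if } C \equiv (A \vee B) \\[10pt]
\dfrac{\Gamma \vdash_\equiv \pi : B}{\Gamma \vdash_\equiv j(\pi) : C} & (\vee\text{-intro}) \text{ if } C \equiv (A \vee B) \\[10pt]
\dfrac{\Gamma \vdash_\equiv \pi_1 : D \quad \Gamma, \alpha : A \vdash_\equiv \pi_2 : C \quad \Gamma, \beta : B \vdash_\equiv \pi_3 : C}{\Gamma \vdash_\equiv (\delta\ \pi_1\ \alpha\pi_2\ \beta\pi_3) : C} & (\vee\text{-elim}) \text{ if } D \equiv (A \vee B) \\[10pt]
\dfrac{\Gamma \vdash_\equiv \pi : B}{\Gamma \vdash_\equiv (\mathit{botelim}\ \pi) : A} & (\bot\text{-elim}) \text{ if } B \equiv \bot \\[10pt]
\dfrac{\Gamma \vdash_\equiv \pi : A}{\Gamma \vdash_\equiv \lambda x\ \pi : B} & (\forall\text{-intro}) \text{ if } x \text{ is not free in } \Gamma \text{ and } B \equiv (\forall x\ A) \\[10pt]
\dfrac{\Gamma \vdash_\equiv \pi : B}{\Gamma \vdash_\equiv (\pi\ t) : C} & (\langle A, t\rangle\ \forall\text{-elim}) \text{ if } B \equiv (\forall x\ A) \text{ and } C \equiv ([t/x]A) \\[10pt]
\dfrac{\Gamma \vdash_\equiv \pi : C}{\Gamma \vdash_\equiv \langle t, \pi\rangle : B} & (\langle A, t\rangle\ \exists\text{-intro}) \text{ if } B \equiv (\exists x\ A) \text{ and } C \equiv ([t/x]A) \\[10pt]
\dfrac{\Gamma \vdash_\equiv \pi : C \quad \Gamma, \alpha : A \vdash_\equiv \pi' : B}{\Gamma \vdash_\equiv (\mathit{exelim}\ \pi\ x\alpha\pi') : B} & (\exists\text{-elim}) \text{ if } x \text{ is not free in } \Gamma, B \text{ and } C \equiv (\exists x\ A)
\end{array}$$

Fig. 2. Typing rules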



Figure 2 gives the typing rules of this calculus. As can easily be seen, we have
a typed λ-calculus, with dependent products. The only originality is that types
are identified modulo $\equiv$.

Remark 2. An alternative presentation of this type system would thus be to take
simply the usual λΠ-calculus extended with dependent pair types (or Σ-types),
but with a generalized conversion rule:

$$\frac{\Gamma \vdash_\equiv t : A}{\Gamma \vdash_\equiv t : B}\ (\text{if } A \equiv B)$$



Obviously a sequent $A_1, \ldots, A_n \vdash_\equiv B$ is derivable in natural deduction modulo
if and only if there is a proof $\pi$ such that the judgment $\alpha_1 : A_1, \ldots, \alpha_n : A_n
\vdash_\equiv \pi : B$ is derivable in this system.

3.2 Proof Reduction Rules

As usual, the process of cut elimination is modeled by (generalized) β-reduction.
The following reductions are usual:

Definition 3.

$$\begin{aligned}
(\lambda\alpha\ \pi_1\ \ \pi_2) &\to [\pi_2/\alpha]\pi_1 \\
\mathit{fst}\langle\pi_1, \pi_2\rangle &\to \pi_1 \\
\mathit{snd}\langle\pi_1, \pi_2\rangle &\to \pi_2 \\
(\delta\ i(\pi_1)\ \alpha\pi_2\ \beta\pi_3) &\to [\pi_1/\alpha]\pi_2 \\
(\delta\ j(\pi_1)\ \alpha\pi_2\ \beta\pi_3) &\to [\pi_1/\beta]\pi_3 \\
(\lambda x\ \pi\ \ t) &\to [t/x]\pi \\
(\mathit{exelim}\ \langle t, \pi_1\rangle\ x\alpha\pi_2) &\to [t/x, \pi_1/\alpha]\pi_2
\end{aligned}$$

A proof is said to be normal (respectively normalizing or strongly normalizing)
if and only if the corresponding λ-term is normal (respectively normalizing or
strongly normalizing). We write $\mathcal{SN}$ for the set of strongly normalizing proofs.

Theorem 1. Provided $\equiv$ is defined from a rewrite system verifying the
conditions of Definition 1, there is no normal proof of the sequent $\vdash_\equiv \bot$.

Proof. A normal closed proof can only end with an introduction rule. Thus, we
should have a congruence like $\bot \equiv A \wedge B$ or $\bot \equiv A \vee B$, which is impossible:
the rewrite rules only rewrite terms and atomic propositions, so $\bot$ is its own
normal form, whereas the normal form of a conjunction or a disjunction is again
a conjunction or a disjunction.



3.3 Counter-Examples to Termination 

To illustrate the subtle link between the combinatorial properties of the rewrite
system $\mathcal{R}$ (termination, confluence, ...) and the logical properties of the induced
formalism (consistency, cut elimination), we here provide two systems where
these properties do not hold.

Example 3. (Russell’s paradox)

Consider the following rewriting system

$$R \to R \Rightarrow S$$

Modulo this rewriting system, the proof $(\lambda\alpha : R\ (\alpha\ \alpha))\ (\lambda\alpha : R\ (\alpha\ \alpha))$ has type S.
The only way to reduce this proof is to reduce it to itself; hence it is not
normalizable.
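The looping behaviour of this proof term can be checked mechanically. The
sketch below forgets the type annotations and implements plain β-reduction on
an assumed tuple encoding of proof terms; the names `subst` and `beta_step` are
hypothetical. Substitution is naive (no renaming), which is safe here only
because the substituted term is closed.

```python
# Assumed encoding: ("var", a) | ("lam", a, body) | ("app", fun, arg)

def subst(term, name, value):
    """[value/name]term, without renaming (value is assumed to be closed)."""
    kind = term[0]
    if kind == "var":
        return value if term[1] == name else term
    if kind == "lam":
        if term[1] == name:          # the binder shadows `name`
            return term
        return ("lam", term[1], subst(term[2], name, value))
    return ("app", subst(term[1], name, value), subst(term[2], name, value))

def beta_step(term):
    """Leftmost-outermost beta step, or None if the term is normal."""
    kind = term[0]
    if kind == "app":
        fun, arg = term[1], term[2]
        if fun[0] == "lam":
            return subst(fun[2], fun[1], arg)
        reduct = beta_step(fun)
        if reduct is not None:
            return ("app", reduct, arg)
        reduct = beta_step(arg)
        if reduct is not None:
            return ("app", fun, reduct)
        return None
    if kind == "lam":
        reduct = beta_step(term[2])
        return None if reduct is None else ("lam", term[1], reduct)
    return None

# λα (α α), typable at R because R ≡ R ⇒ S in the system above.
self_app = ("lam", "alpha", ("app", ("var", "alpha"), ("var", "alpha")))
proof = ("app", self_app, self_app)
print(beta_step(proof) == proof)
```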

An instance of this rule is skolemized naive set theory. In naive set theory
we have the following axiom scheme

$$\forall x_1 \ldots \forall x_n\ \exists y\ \forall z\ (z \in y \Leftrightarrow P)$$

for any propositional expression P.

Skolemizing this scheme, we introduce for each proposition P a
symbol $f_{x_1, \ldots, x_n, z, P}$ and an axiom

$$\forall x_1 \ldots \forall x_n\ \forall z\ (z \in f_{x_1, \ldots, x_n, z, P}(x_1, \ldots, x_n) \Leftrightarrow P)$$

This axiom can be turned into the rewrite rule

$$z \in f_{x_1, \ldots, x_n, z, P}(x_1, \ldots, x_n) \to P$$

In particular, we have a rewrite rule

$$z \in f_{z, (z \in z) \Rightarrow \bot} \to (z \in z) \Rightarrow \bot$$

and hence, writing R for the proposition $f_{z, (z \in z) \Rightarrow \bot} \in f_{z, (z \in z) \Rightarrow \bot}$ and S for the
proposition $\bot$, we have

$$R \to R \Rightarrow S.$$

We have thus reconstructed Russell’s counter-example to consistency and cut
elimination for naive set theory.

Example 4. (Crabbé’s counter-example)

Even if Zermelo’s set theory (Z) is considered to be consistent, it is
well-known that cut elimination is problematic and does not hold in general. The proof
of non-normalization is called Crabbé’s counter-example (see [8,5] for details).
Again, it is illustrated here by the fact that the straightforward encoding of Z
as a deduction modulo necessitates a non-terminating rewrite system.

Consider the following rewriting system

$$C \to E \wedge (C \Rightarrow D)$$

Modulo this rewriting system, the proof

$$(\lambda\alpha : C\ (\mathit{snd}(\alpha)\ \alpha))\ \langle\beta, \lambda\alpha : C\ (\mathit{snd}(\alpha)\ \alpha)\rangle$$

is a proof of D in the context $\beta : E$. The only way to reduce this proof is to reduce
it to

$$(\mathit{snd}(\langle\beta, \lambda\alpha : C\ (\mathit{snd}(\alpha)\ \alpha)\rangle)\ \langle\beta, \lambda\alpha : C\ (\mathit{snd}(\alpha)\ \alpha)\rangle)$$

and then to itself

$$(\lambda\alpha : C\ (\mathit{snd}(\alpha)\ \alpha))\ \langle\beta, \lambda\alpha : C\ (\mathit{snd}(\alpha)\ \alpha)\rangle$$

Hence it is not normalizable.

An instance of this example is skolemized set theory. In set theory we
have an axiom scheme

$$\forall x_1 \ldots \forall x_n\ \forall w\ \exists y\ \forall z\ (z \in y \Leftrightarrow (z \in w \wedge P))$$

Skolemizing this scheme, we introduce for each proposition P a
symbol $f_{x_1, \ldots, x_n, z, P}$ and an axiom

$$\forall x_1 \ldots \forall x_n\ \forall w\ \forall z\ (z \in f_{x_1, \ldots, x_n, z, P}(x_1, \ldots, x_n, w) \Leftrightarrow (z \in w \wedge P))$$

This axiom can be turned into the rewrite rule

$$z \in f_{x_1, \ldots, x_n, z, P}(x_1, \ldots, x_n, w) \to z \in w \wedge P$$

In particular, we have a rewrite rule

$$z \in f_{z, (z \in z) \Rightarrow \bot}(w) \to z \in w \wedge ((z \in z) \Rightarrow \bot)$$

and hence, writing C for the proposition $f_{z, (z \in z) \Rightarrow \bot}(w) \in f_{z, (z \in z) \Rightarrow \bot}(w)$, D for
the proposition $\bot$ and E for the proposition $f_{z, (z \in z) \Rightarrow \bot}(w) \in w$, we have

$$C \to E \wedge (C \Rightarrow D)$$

In these examples the rewriting system itself is not terminating, as R (resp. C)
reduces to a proposition where it occurs. We conjecture that this non-termination
is responsible for the non-termination of reduction of proofs.

Conjecture 1. If $\mathcal{R}$ is a confluent and normalizing rewrite system (resp. class
rewrite system), then proof reduction modulo $\mathcal{R}$ is normalizing.

An obvious consequence is that deduction modulo $\mathcal{R}$ is consistent, by
Theorem 1.

4 Proving Normalization 

Now we want to prove some particular cases of the conjecture. First, that proofs
normalize for the definition of higher-order logic given above. Then, that proofs
normalize for all rewrite systems reducing terms to terms and atomic propositions
to quantifier-free propositions (as in the simplification example above). Finally,
that proofs normalize modulo all positive rewrite systems, i.e. ones in which
the right-hand side of each rewrite rule contains only positive occurrences of
atomic propositions.

We first define a notion of pre-model and prove that when a congruence
bears a pre-model then proofs normalize modulo this congruence. Then we shall
construct pre-models for our particular cases.

4.1 Reducibility Techniques 

The basic tools used hereafter are the ones of reducibility proofs, whose main 
concepts are due to Tait [12] and Girard [6,7]. In particular, since we want to 
treat the case of higher-order logic, we need some form of reducibility candidates. 
We here take a definition similar to [7], but other ones like Tait’s saturated sets 
would also apply. 

Definition 4 (Neutral proof).

A proof is said to be neutral if its last rule is an axiom or an elimination,
but not an introduction.

Definition 5 (Reducibility candidate).

A set $R$ of proofs is a reducibility candidate if

- if $\pi \in R$, then $\pi$ is strongly normalizable,

- if $\pi \in R$ and $\pi \to \pi'$ then $\pi' \in R$,

- if $\pi$ is neutral and if for every $\pi'$ such that $\pi \to \pi'$, $\pi' \in R$ then $\pi \in R$.

Mostly, we follow the main scheme of reducibility proofs. That is, we try, for
every proposition $P$, to exhibit a set of terms $\mathcal{R}_P$ such that:

- All elements of $\mathcal{R}_P$ are strongly normalizing.

- If $\Gamma \vdash t : P$ holds, then $t \in \mathcal{R}_P$ (that is, modulo some closure condition
w.r.t. substitution).

The first condition is ensured by verifying that all $\mathcal{R}_P$ are reducibility
candidates. The second one is proved by induction over the derivation of $\Gamma \vdash t : P$
using closure conditions due to the definition of $\mathcal{R}_P$. Typically, for instance: if
$\pi \in \mathcal{R}_{A \Rightarrow B}$ then for each proof $\pi'$ element of $\mathcal{R}_A$, $(\pi\ \pi')$ is an element of $\mathcal{R}_B$.

Most importantly, and as for other calculi with dependent types, we will need
the condition that if $A \equiv B$ then $\mathcal{R}_A = \mathcal{R}_B$.

The closure condition above for $\mathcal{R}_{A \Rightarrow B}$ can be viewed as a partial definition;
the situation is similar for the other connectors and quantifiers. Thus, we
understand that the crucial step will be to choose the right sets for $\mathcal{R}_A$ in the
case where $A$ is an atomic proposition (that is, potentially a redex w.r.t. $\mathcal{R}$).
In other words, to define the family $\mathcal{R}_A$, it will be enough to define for each
predicate symbol $P$ the sets $\mathcal{R}_{P(t_1,\ldots,t_n)}$ or equivalently to give, for each $n$-ary
predicate symbol $P$, a function $\hat{P}$ that maps $n$-tuples of terms to some well-chosen
reducibility candidate.

It is well known that a reducibility proof essentially boils down to the construction
of a particular syntactical model. This comparison is particularly striking
here since, in first-order logic, to define a model, we also need to provide, for
each predicate symbol $P$, a function $\hat{P}$ that maps every $n$-tuple of terms to a
truth value.

We can pursue this comparison. If two terms $t_1$ and $t'_1$ are congruent then
the sets $\hat{P}(t_1,\ldots,t_n)$ and $\hat{P}(t'_1,\ldots,t_n)$ must be identical. The function $\hat{P}$ is then
better defined as a function from an abstract object (for instance, the class of $t_1$
and $t'_1$) that $t_1$ and $t'_1$ denote.

Then the condition that two congruent propositions must have the same
denotation can be expressed as the fact that the rewrite rules are valid in the
model.



4.2 Pre-model 

Formalizing the discussion above, we end up with the following notion.

Definition 6 (Pre-model). Let $\mathcal{C}$ be the set of all reducibility candidates.

Let $\mathcal{L}$ be a (many-sorted) first-order language. A pre-model for $\mathcal{L}$ is given by:

- for each sort $T$ a set $M_T$,

- for each function symbol $f$ (of rank $(s_1, \ldots, s_n, s')$) a function
$\hat{f} \in M_{s'}^{M_{s_1} \times \ldots \times M_{s_n}}$,

- for each predicate symbol $P$ (of rank $(s_1, \ldots, s_n)$) a function
$\hat{P} \in \mathcal{C}^{M_{s_1} \times \ldots \times M_{s_n}}$.

Definition 7. Let $t$ be a term and $\varphi$ an assignment mapping all the free variables
of $t$ of sort $T$ to elements of $M_T$. We define the object $|t|_\varphi$ by induction
over the structure of $t$.

- $|x|_\varphi = \varphi(x)$,

- $|f(t_1, \ldots, t_n)|_\varphi = \hat{f}(|t_1|_\varphi, \ldots, |t_n|_\varphi)$.

Definition 8. Let $A$ be a proposition and $\varphi$ an assignment mapping all the free
variables of $A$ of sort $T$ to elements of $M_T$. We define the set $|A|_\varphi$ of proofs by
induction over the structure of $A$.

- A proof $\pi$ is an element of $|P(t_1, \ldots, t_n)|_\varphi$ if it belongs to $\hat{P}(|t_1|_\varphi, \ldots, |t_n|_\varphi)$
(and is thus strongly normalizable).

- A proof $\pi$ is an element of $|A \Rightarrow B|_\varphi$ if it is strongly normalizable and when $\pi$
reduces to a proof of the form $\lambda\alpha\,\pi_1$ then for every $\pi'$ in $|A|_\varphi$, $[\pi'/\alpha]\pi_1$ is an
element of $|B|_\varphi$.

- A proof $\pi$ is an element of $|A \wedge B|_\varphi$ if it is strongly normalizable and when
$\pi$ reduces to a proof of the form $(\pi_1, \pi_2)$ then $\pi_1$ and $\pi_2$ are elements of $|A|_\varphi$
and $|B|_\varphi$.

- A proof $\pi$ is an element of $|A \vee B|_\varphi$ if it is strongly normalizable and when
$\pi$ reduces to a proof of the form $i(\pi_1)$ (resp. $j(\pi_2)$) then $\pi_1$ (resp. $\pi_2$) is an
element of $|A|_\varphi$ (resp. $|B|_\varphi$).

- A proof $\pi$ is an element of $|\bot|_\varphi$ if it is strongly normalizable.¹

- A proof $\pi$ is an element of $|\forall x\, A|_\varphi$ if it is strongly normalizable and when $\pi$
reduces to a proof of the form $\lambda x\,\pi_1$ then for every term $t$ of sort $T$ (where $T$
is the sort of $x$) and every element $E$ of $M_T$, the proof $[t/x]\pi_1$ is an element
of $|A|_{\varphi + (x,E)}$.

- A proof $\pi$ is an element of $|\exists x\, A|_\varphi$ if it is strongly normalizable and there
exists an element $E$ of $M_T$ (where $T$ is the sort of $t$) such that when $\pi$
reduces to a proof of the form $(t, \pi_1)$, then $\pi_1$ is an element of $|A|_{\varphi + (x,E)}$.

Looking at the two last clauses of this definition, we may notice that no cor- 
relation is required between the interpretation of the proof variables p and the 
instantiations of object variables. This simplifies the proof, and is possible since 
instantiating object variables in proof terms does not create new (proof-)redexes. 
This is somewhat similar to the situation in typed $\lambda$-calculi, where the substitution
of type variables does not create redexes in terms (see [7] for instance).



¹ As usual, we could choose just about any other reducibility candidate for the
definition of $|\bot|_\varphi$.

Definition 9. A pre-model is a pre-model of $\equiv$ if when $A \equiv B$ then for every
assignment $\varphi$, $|A|_\varphi = |B|_\varphi$.

The following usual conditions are easily proved by induction over the proposition
$A$; the two last ones require a little more case analysis.

Proposition 2. For any proposition $A$, term $t$, variable $x$ and assignment $\varphi$:

- If $\pi$ is an element of $|A|_\varphi$ then $\pi$ is strongly normalizable.

- If $\pi$ is an element of $|A|_\varphi$ and $\pi \to \pi'$, then $\pi'$ is an element of $|A|_\varphi$.

- If $\pi$ is neutral and if for every $\pi'$ such that $\pi \to \pi'$, $\pi' \in |A|_\varphi$, then $\pi \in |A|_\varphi$.

From the three last properties, we deduce:

Lemma 1. For every proposition $A$ and assignment $\varphi$, $|A|_\varphi$ is a reducibility
candidate.



4.3 The Normalization Theorem 

We can now prove that if a system has a pre-model then proofs modulo this
system normalize. The proofs of this section are a little long and tedious but
bear no essential novelty. We omit them for lack of space and again refer
to [4] for details.

Theorem 2. Let $A$ be a proposition and $\pi$ a proof of $A$ modulo $\equiv$. Let $\theta$ be a
substitution mapping the free variables of sort $T$ of $A$ to terms of sort $T$, $\varphi$ be an
assignment mapping free variables of $A$ to elements of $M_T$ and $\sigma$ a substitution
mapping proof variables of propositions $B$ to elements of $|B|_\varphi$. Then $\sigma\theta\pi$ is an
element of $|A|_\varphi$.

Corollary 1. Every proof of $A$ is in $|A|_\emptyset$ and hence strongly normalizable.



4.4 Pre-model Construction 

Constructing the pre-model for a given theory is the part of the consistency
proof that bears the logical complexity; i.e. it is the part of the proof that
cannot be done in the theory itself. The construction for HOL essentially follows
the original proof. The two other ones we present are more typical of deduction
modulo.

Proposition 3. Higher-order logic has a pre-model, hence proofs normalize in
higher-order logic.

Proof. We construct a pre-model as follows. The essential point is that we anticipate
the fact that objects of sort $o$ actually represent propositions, by interpreting
them as reducibility candidates. Thus quantification over $o$ becomes
impredicative in the model.

$$M_o = \mathcal{C} \qquad M_\iota = \{0\} \qquad M_{T \to U} = M_U^{M_T}$$

$$\hat{S} = a \mapsto (b \mapsto (c \mapsto a(c)(b(c)))) \qquad \hat{K} = a \mapsto (b \mapsto a) \qquad \hat{\alpha}(a, b) = a(b) \qquad \hat{\varepsilon}(a) = a$$

$$\hat{\Rightarrow}(a, b) = \{\pi \in SN \mid \pi \to^{*} \lambda\alpha\,\pi_1 \Rightarrow \forall \pi' \in a.\ [\pi'/\alpha]\pi_1 \in b\}$$

$$\hat{\wedge}(a, b) = \{\pi \in SN \mid \pi \to^{*} (\pi_1, \pi_2) \Rightarrow \pi_1 \in a \wedge \pi_2 \in b\}$$

$$\hat{\vee}(a, b) = \{\pi \in SN \mid (\pi \to^{*} i(\pi_1) \Rightarrow \pi_1 \in a) \wedge (\pi \to^{*} j(\pi_2) \Rightarrow \pi_2 \in b)\}$$

$$\hat{\bot} = SN$$

$$\hat{\forall}_T(a) = \{\pi \in SN \mid \pi \to^{*} \lambda x\,\pi_1 \Rightarrow \forall t : T.\ \forall E \in M_T.\ [t/x]\pi_1 \in a(E)\}$$

$$\hat{\exists}_T(a) = \{\pi \in SN \mid \exists E \in M_T.\ \pi \to^{*} (t, \pi_1) \Rightarrow \pi_1 \in a(E)\}$$

We do not detail the proof that if $A \equiv B$ then $|A|_\varphi = |B|_\varphi$.

In this last case, it is the presence of the quantifier $\forall$ on the right-hand part
of one of the rewrite schemes that is responsible for the impredicativity of the
resulting logic. We can give a generic proof of cut elimination for the predicative
case:

Proposition 4. A quantifier-free confluent terminating rewrite system has a
pre-model, hence proofs normalize modulo such a rewrite system.

Proof. To each normal closed quantifier-free proposition $A$, we associate a set of
proofs $\Psi(A)$.

$$\Psi(A) = SN \quad \text{if } A \text{ is atomic}$$

$$\Psi(A \Rightarrow B) = \{\pi \in SN \mid \pi \to^{*} \lambda\alpha\,\pi_1 \Rightarrow \forall \pi' \in \Psi(A).\ [\pi'/\alpha]\pi_1 \in \Psi(B)\}$$

$$\Psi(A \wedge B) = \{\pi \in SN \mid \pi \to^{*} (\pi_1, \pi_2) \Rightarrow \pi_1 \in \Psi(A) \wedge \pi_2 \in \Psi(B)\}$$

$$\Psi(A \vee B) = \{\pi \in SN \mid (\pi \to^{*} i(\pi_1) \Rightarrow \pi_1 \in \Psi(A)) \wedge (\pi \to^{*} j(\pi_2) \Rightarrow \pi_2 \in \Psi(B))\}$$

$$\Psi(\bot) = SN$$

Then we define a pre-model as follows. Let $M_T$ be the set of normal closed
terms of sort $T$.

$$\hat{f}(t_1, \ldots, t_n) = f(t_1, \ldots, t_n){\downarrow}$$

$$\hat{P}(t_1, \ldots, t_n) = \Psi(P(t_1, \ldots, t_n){\downarrow})$$

where $t{\downarrow}$ (resp. $A{\downarrow}$) stands for the normal form of $t$ (resp. $A$).

Again, we leave out the proof that if $A \equiv B$ then $|A|_\varphi = |B|_\varphi$.

Remark 3. In this normalization proof we use the fact that some sets are reducibility
candidates, but we never quantify over all reducibility candidates,
reflecting the fact that we deal here with predicative systems.

Finally, since the interpretation $|A|_\varphi$ of an arbitrary proposition $A$ is determined
by the choice of the interpretation for normal atomic propositions, it is
tempting to define the latter by a fixpoint construction. This is possible if the
rewrite system induces a monotone interpretation function.

Definition 10. Let $\mathcal{R}$ be a terminating and confluent rewrite system, rewriting
atomic to non-atomic propositions. This system is said to be positive if the
right-hand side of each rewrite rule contains only positive occurrences of atomic
propositions.

Proposition 5. A positive rewrite system bears a pre-model, and thus the induced
deduction system enjoys proof normalization and consistency.

We again refer to [4] for details about the fixpoint construction of the pre-model.
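The positivity condition of Definition 10 is a purely syntactic check on right-hand sides. The following is a minimal sketch, assuming a hypothetical encoding of propositions as nested tuples and the usual convention that the left-hand side of an implication flips polarity; the function name `only_positive_atoms` is illustrative. Termination and confluence, which Definition 10 also requires, are not checked here.

```python
# Propositions as nested tuples (an illustrative encoding):
#   ('atom', name) | ('bot',) | ('imp', A, B) | ('and', A, B) | ('or', A, B)
#   ('forall', var, A) | ('exists', var, A)

def only_positive_atoms(prop, polarity=True):
    """True if every atomic proposition in `prop` occurs positively.
    `polarity` is the polarity of the position currently being visited."""
    tag = prop[0]
    if tag == 'atom':
        return polarity
    if tag == 'bot':
        return True
    if tag == 'imp':
        # The premise of an implication is in the opposite polarity.
        return (only_positive_atoms(prop[1], not polarity)
                and only_positive_atoms(prop[2], polarity))
    if tag in ('and', 'or'):
        return (only_positive_atoms(prop[1], polarity)
                and only_positive_atoms(prop[2], polarity))
    if tag in ('forall', 'exists'):
        return only_positive_atoms(prop[2], polarity)
    raise ValueError(f"unknown connective: {tag}")

R, S = ('atom', 'R'), ('atom', 'S')
C, D, E = ('atom', 'C'), ('atom', 'D'), ('atom', 'E')

russell_rhs = ('imp', R, S)                  # R --> R => S
crabbe_rhs = ('and', E, ('imp', C, D))       # C --> E /\ (C => D)
harmless_rhs = ('or', S, ('imp', ('imp', R, ('bot',)), S))

russell_ok = only_positive_atoms(russell_rhs)
crabbe_ok = only_positive_atoms(crabbe_rhs)
harmless_ok = only_positive_atoms(harmless_rhs)
```

Under this reading, the right-hand sides of the Russell and Crabbé rules are rejected because $R$ and $C$ occur as premises of an implication, while the last example is accepted since $R$ sits under two implication premises.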

5 Conclusion 

We have defined generically a wide range of deductive systems. Every system 
is defined by a given rewrite system over first-order propositions. We have seen 
that the systems so defined go further than first-order logic. 

We conjecture that simple combinatorial conditions on the rewrite system
imply the cut elimination property and thus logical consistency. This conjecture
implies the consistency of Church's higher-order logic. It is also interesting
to remark that, even if this conjecture holds, the logical strength of deduction
modulo is not yet clear. In other words, we do not know which is the strongest
logical system definable as a deduction modulo. We have seen, though, that naive
attempts to encode set theory do not succeed.

In any case, it seems that studying rewrite systems from their logical prop- 
erties is a new, promising and interesting subject. 



Acknowledgements 

We thank an anonymous referee for pointing out an error in the normalization 
proof, and for helpful comments. 



References 

1. S. Boutin, Réflexion sur les quotients. Doctoral thesis, Université de Paris 7 (1997).

2. T. Coquand and G. Huet, The Calculus of Constructions. Information and
Computation, 76 (1988) pp. 95-120.

3. G. Dowek, Th. Hardin and C. Kirchner, Theorem proving modulo. Rapport de
Recherche INRIA 3400 (1998).

4. G. Dowek and B. Werner, Proof normalization modulo. Rapport de Recherche INRIA
3542 (1998).

5. J. Ekman, Normal proofs in set theory. Doctoral thesis, Chalmers University of
Technology and University of Göteborg (1994).

6. J. Y. Girard, Interprétation fonctionnelle et élimination des coupures dans
l'arithmétique d'ordre supérieur. Thèse de Doctorat d'État, Université de Paris
VII (1972).

7. J. Y. Girard, Y. Lafont and P. Taylor, Proofs and Types. Cambridge University
Press (1989).

8. L. Hallnäs, On normalization of proofs in set theory. Doctoral thesis, University of
Stockholm (1983).

9. P. Martin-Löf, Intuitionistic type theory. Bibliopolis, Napoli (1984).

10. Ch. Paulin-Mohring, Inductive definitions in the system COQ, Rules and Properties.
Typed Lambda Calculi and Applications, Lecture Notes in Computer Science
664 (1993) pp. 328-345.

11. G. Plotkin, Building-in equational theories. Machine Intelligence, 7 (1972) pp. 73-90.

12. W. W. Tait, Intensional interpretation of functionals of finite type I. Journal of
Symbolic Logic, 32, 2 (1967) pp. 198-212.



Proof of Imperative Programs in Type Theory 



Jean-Christophe Filliâtre*



LRI, URA CNRS 410, Bât. 490, Université Paris Sud
91405 ORSAY Cedex, France
Jean-Christophe.Filliatre@lri.fr
www.lri.fr/~filliatr



Abstract. We present a new approach to certifying functional programs
with imperative aspects, in the context of Type Theory. The key is a
functional translation of imperative programs, based on a combination
of the type and effect discipline and monads. Then an incomplete proof
of the specification is built in the Type Theory, whose gaps correspond
to proof obligations. On sequential imperative programs, we get
the same proof obligations as those given by Floyd-Hoare logic. Compared
to the latter, our approach also includes functional constructions
in a straightforward way. This work has been implemented in the Coq
Proof Assistant and applied to non-trivial examples.



Introduction 

The methods for proving programs developed in the last decades (see [4] for a
survey), based on Floyd-Hoare logic [8,6] or on Dijkstra's calculus of weakest
preconditions [5], have certainly been very successful. But the specification
languages involved were usually first-order predicate logics of low expressiveness,
which surely contributed to their relative failure in real case studies. More recent
methods try to fill this gap, as for instance the B method [2], whose specification
language includes a rather large part of first-order set theory. Type Theory also
provides expressive logics, well understood in theory and relatively easy to implement
since they are based on a small set of rules. The Calculus of Inductive
Constructions (Cic for short) is one of the most powerful logical frameworks of
this kind, and the Coq Proof Assistant [1] is one of its implementations. Type
Theory is naturally well suited to the proof of purely functional programs. We
show that it is also a good framework to specify and prove imperative programs.

There are already some formalizations of Floyd-Hoare logic in higher-order
logical frameworks, as for instance the work of T. Schreiber in LEGO [15] or
M. Gordon in HOL [7]. Nevertheless, those formalizations still have the disadvantages
of Floyd-Hoare logic: well understood on small imperative languages,
they appeared to be difficult to extend to real programming languages. For example,
the few base datatypes (integers, booleans, ...) are usually not sufficient and
the programmer quickly has to construct new datatypes using arrays, records,
pointers or a primitive notion of recursive datatypes. Then the extensions of
Floyd-Hoare logic become painful (see [4], page 931).

* This research was partly supported by ESPRIT Working Group "Types".

For all those reasons, we think that we have to start with a more realistic
language and to propose a more extensible method. We chose to consider a programming
language with both functional features (functions as first-class objects,
higher-order functions, ...) and imperative ones (references, while loops, ...). The
base objects will be the ones of the Type Theory, which immediately gives a wide
range of datatypes (for instance lists, trees, ... defined as inductive types in the
Cic). References will be limited to purely functional values. Notice that such a
language already includes FORTRAN, Pascal without an explicit use of pointers,
and a rather large subset of ML.

In the traditional approach of Floyd-Hoare logic, programs are seen as state
transformers, where a state maps variables to values. Then the total correctness
of a program $M$ can be expressed in the following way:

$$\{P\}\ M\ \{Q\} \equiv \forall s.\ P(s) \Rightarrow \exists s'.\ ([\![M]\!](s, s') \wedge Q(s, s'))$$

where $[\![M]\!]$ is the semantic interpretation of $M$ as a relation between input and
output states. It is easy to define the type of these states when they only contain
integers and booleans for instance. But this becomes difficult as soon as states
may contain objects of any type. Moreover, in real case studies one has to express
in post-conditions that some parts of the state are not modified by the program.
This is not natural.

Thinking of a variable $x$ as an index in a global store is very close to the
implementation of imperative languages, where $x$ is a pointer in the heap. It is
necessary when there is possible aliasing in programs, i.e. when two variables
may point to the same object. But in practice, most programs considered do
not contain aliased variables. Then we can directly represent the contents of
a variable $x$ of the program by a variable of the logic. Predicates about the
program's variables become predicates about the logic's variables, and not about
some accessed values in a global store.

Consequently, we propose to express the semantics of an imperative program
$M$ by a functional program $\overline{M}$ taking as argument a tuple $\mathbf{x}$ of the values
of the variables of $M$ and returning a tuple $\mathbf{y}$ of the values of the variables
(possibly) modified by $M$, together with the result $v$ of the evaluation. Then,
the correctness may be written

$$\{P\}\ M\ \{Q\} \equiv \forall \mathbf{x}.\ P(\mathbf{x}) \Rightarrow \exists (\mathbf{y}, v).\ ((\mathbf{y}, v) = \overline{M}(\mathbf{x}) \wedge Q(\mathbf{x}, \mathbf{y}, v)) \qquad (1)$$

The functional translation relies on a static analysis of effects, following
J.-P. Talpin and P. Jouvelot's Type and Effect Discipline [16], and on the use of
monads [11,17].
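Formula (1) can be read operationally: once the translation is a total function, the existential is witnessed by simply running it. The following is a minimal sketch for the one-line program `x := !x + 1`, assuming dictionaries as tuples of named values; the names `m_bar`, `pre`, `post` and `holds_on` are hypothetical, and testing on finitely many inputs is of course not a proof of the formula.

```python
def m_bar(x):
    """Hand-written functional counterpart of the program  x := !x + 1.
    Takes the input values, returns (modified values, result)."""
    return ({'x': x['x'] + 1}, None)

def pre(x):
    """Precondition P: the initial value of x is non-negative."""
    return x['x'] >= 0

def post(x, y, v):
    """Postcondition Q: the final value of x is greater than the initial one."""
    return y['x'] > x['x']

def holds_on(samples):
    """Check P(x) => Q(x, y, v) with (y, v) = m_bar(x) on each sample input."""
    return all((not pre(x)) or post(x, *m_bar(x)) for x in samples)

checked = holds_on([{'x': n} for n in range(-3, 10)])
```

The postcondition mentions both the input values `x` and the output values `y`, which is what lets a specification relate the state before and after execution without a global store.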

Then we propose a method to establish the correctness formula (1). To do
that, we construct an incomplete proof $\widehat{M}$ of this proposition. By "incomplete"
we mean that the proof term $\widehat{M}$ still contains gaps, whose types are known. Those
gaps will give the proof obligations. Our work may be seen as an extension of the
work of C. Parent [13] to imperative programs. Her approach to proving purely
functional programs, based on realizability, consists in building an incomplete
proof of the specification whose skeleton is the program itself. We extend this
idea using a functional translation of imperative programs.

This paper is organized as follows. In the first section, we introduce a programming
language with functional and imperative aspects, and we define a
notion of typing with effects for this language. This allows us to define a functional
translation of programs, which is proved to be semantically correct. In the
next section, we introduce annotations in programs (pre- and post-conditions,
variants) and we define an incomplete proof associated to each program, whose
proof obligations establish the total correctness when proved. In the third section,
we briefly describe the implementation of this work, which is already part
of the Coq Proof Assistant [1]. Finally, we discuss related work and propose
some possible extensions.



1 Effects and Functional Translation 

Preliminary definitions and notations. In this paper we are going to consider
two different kinds of programs: purely functional ones and imperative ones.
Let $T$ be the type system of purely functional programs and $\Gamma \vdash e : t$ the
corresponding typing judgment, where $\Gamma$ is a typing environment, i.e. a mapping
from variables to types. We assume that $T$ contains at least the type of booleans
and a type unit which has only one value, the constant void. The type system
of imperative programs will be $T_{ref} ::= T \mid T\ \mathtt{ref}$, where $T\ \mathtt{ref}$ is the type of
references over objects of type $T$. The corresponding typing judgment will be
written $\Gamma \vdash_r e : t$. Dereferencing will be written $!x$ and assignment $x := e$, as
usual in ML languages.

The abstract syntax of programs is given in Figure 1. The nonterminal symbol
$E$ stands for a purely functional expression, including possible dereferencing
of variables. For the moment, we only consider call-by-value application,
written $(f\ e)$; call-by-name will be considered at the end of this section. To
simplify the presentation, we chose to restrict boolean expressions appearing in
if and while statements to be purely functional expressions. The precise meaning
of $V$ in the abstraction $\mathtt{fun}\ (x : V) \to M$ will be explained in the next paragraph.



1.1 Effects 

The functional translation of imperative programs that we are going to define
relies on a careful analysis of their effects. We do not intend to add new ideas
or results to the existing theory of static analysis and its application to effect
inference. Actually, we are only interested in determining the sets of variables
possibly accessed or modified by a given program. In the general case, this would
require a complex analysis of regions, as defined by J.-P. Talpin and P. Jouvelot
in [16], or some similar technique, but since we have eliminated alias possibilities
in our programs, the solution is much simpler here.

    M ::= E
        | x := M
        | M ; M
        | if E then M else M
        | while E do M done
        | let x = ref M in M
        | fun (x : V) → M
        | (M M)

Fig. 1. Abstract syntax of programs



The effect $\epsilon$ of a program expression will be a pair $(R, W)$ of two sets of
variables, $R$ being the set of all the references involved in the evaluation of the
expression and $W$ the subset of $R$ of references which are possibly modified
during this evaluation. If $\epsilon_1 = (R_1, W_1)$ and $\epsilon_2 = (R_2, W_2)$ are two effects, then
$\epsilon_1 \cup \epsilon_2$ will denote their union, i.e. the effect $(R_1 \cup R_2, W_1 \cup W_2)$. Finally $\epsilon \backslash x$ will
denote the effect obtained by removing the occurrences of $x$ in the effect $\epsilon$.

To do type inference with effects, we introduce a new type system composed
of a type system for values, $V$, and a type system for computations, $C$. They
are mutually recursively defined as follows:

$$V ::= T \mid T\ \mathtt{ref} \mid V \to C \qquad (2)$$

$$C ::= (V, \epsilon) \qquad (3)$$

In definition (2), we express the fact that functions are first-class values and that,
since we chose call-by-value semantics, functions take values as arguments to
produce computations. In definition (3), we express that a computation returns a
value together with an effect. Those definitions are clearly driven by the semantics
and would have been different with call-by-name semantics.

To type programs with effects, we must now consider environments mapping
variables to types of values, i.e. to expressions of the type system $V$. If $\Gamma$ is such
an environment, then $\Gamma \vdash M : C$ will denote the typing judgment expressing
that a program $M$ has the type of computation $C$. This judgment is established
by the inference rules given in Figure 2. It is clear that this judgment is decidable
and that the type of computation of a program is unique.

$$\frac{\Gamma \vdash_r E : T \qquad R = \{\, x \in \mathrm{dom}(\Gamma) \mid\ !x \text{ in } E \,\}}{\Gamma \vdash E : (T, (R, \emptyset))}\ (\text{EXP})$$

$$\frac{x : T\ \mathtt{ref} \in \Gamma \qquad \Gamma \vdash M : (T, (R, W))}{\Gamma \vdash x := M : (\mathtt{unit}, (\{x\} \cup R, \{x\} \cup W))}\ (\text{ASSIGN})$$

$$\frac{\Gamma \vdash M_1 : (\mathtt{unit}, \epsilon_1) \qquad \Gamma \vdash M_2 : (V, \epsilon_2)}{\Gamma \vdash M_1 ; M_2 : (V, (\epsilon_1 \cup \epsilon_2))}\ (\text{SEQ})$$

$$\frac{\Gamma \vdash B : (\mathtt{bool}, \epsilon_0) \qquad \Gamma \vdash M_1 : (V, \epsilon_1) \qquad \Gamma \vdash M_2 : (V, \epsilon_2)}{\Gamma \vdash \mathtt{if}\ B\ \mathtt{then}\ M_1\ \mathtt{else}\ M_2 : (V, (\epsilon_0 \cup \epsilon_1 \cup \epsilon_2))}\ (\text{COND})$$

$$\frac{\Gamma \vdash B : (\mathtt{bool}, \epsilon_0) \qquad \Gamma \vdash M : (\mathtt{unit}, \epsilon)}{\Gamma \vdash \mathtt{while}\ B\ \mathtt{do}\ M\ \mathtt{done} : (\mathtt{unit}, (\epsilon_0 \cup \epsilon))}\ (\text{LOOP})$$

$$\frac{\Gamma \vdash M_1 : (T_1, \epsilon_1) \qquad \Gamma, x : T_1\ \mathtt{ref} \vdash M_2 : (T_2, \epsilon_2)}{\Gamma \vdash \mathtt{let}\ x = \mathtt{ref}\ M_1\ \mathtt{in}\ M_2 : (T_2, (\epsilon_1 \cup \epsilon_2) \backslash x)}$$

$$\frac{\Gamma, x : V \vdash M : C}{\Gamma \vdash \mathtt{fun}\ (x : V) \to M : ((V \to C), (\emptyset, \emptyset))}\ (\text{ABS})$$

$$\frac{\Gamma \vdash M_1 : ((V \to (V_1, \epsilon_1)), \epsilon_0) \qquad \Gamma \vdash M_2 : (V, \epsilon_2)}{\Gamma \vdash (M_1\ M_2) : (V_1, (\epsilon_0 \cup \epsilon_1 \cup \epsilon_2))}\ (\text{APP})$$

Fig. 2. Typing with effects

1.2 Functional Translation

The functional translation we are going to define is based on the idea of monads.
Monads were introduced by E. Moggi [11] and P. Wadler [17], in rather different
contexts. They are used by Moggi to give semantics to programming languages.
The motivations of Wadler are more pragmatic and closer to ours: he uses
monads to incorporate imperative aspects (stores, exceptions, input-output, ...)
in purely functional languages. A monad is composed of a type operator $\mu$, and
two operators

$$\mathit{unit} : A \to \mu(A)$$

$$\mathit{star} : \mu(A) \to (A \to \mu(B)) \to \mu(B)$$

satisfying three identities (which it is not necessary to give here). The main idea
is that $\mu(A)$ is the type of the computations of type $A$. The unit operator is an
injection of values into computations. The star operator takes a computation,
evaluates it and passes its result to a function which returns a new computation.
If we consider references, a possible monad is the one defined by $\mu(A) = S \to S \times A$,
where $S$ represents the store, and where unit and star are defined by

$$\mathit{unit}\ v = \lambda s.\ (s, v)$$

$$\mathit{star}\ m\ f = \lambda s.\ \mathtt{let}\ (s', v) = (m\ s)\ \mathtt{in}\ (f\ v\ s')$$

In our case, such a monad is too coarse, mainly because a global store does not
allow proofs of programs to be modular (you have to express that some parts of
the store are left unchanged by the program, which is unnatural and painful).
So we use local stores, i.e. tuples of values directly representing the values of input
and output variables given by the inference of effects. For instance, a computation
involving the references x and y of type int ref which can modify the value of x
will be translated into a function taking the values of x and y as input and
returning the new value of x together with the result of the computation.

As noticed by Moggi, the star operator really acts as the let in operator
of ML. In the following we will use a let in notation to make programs more
readable, but the reader should have in mind the use of the monad operator to
understand the generality of the discourse.
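To make the global-store monad $\mu(A) = S \to S \times A$ concrete, here is a minimal sketch in Python, where a store is a dictionary and a computation is a function from a store to a (store, value) pair. `unit` and `star` follow the two defining equations above; the helpers `read` and `write` are hypothetical additions for the example and are not part of the paper.

```python
def unit(v):
    """Inject a value into a computation that leaves the store untouched."""
    return lambda s: (s, v)

def star(m, f):
    """Run computation m, then pass its result to f and run what f returns."""
    def run(s):
        s1, v = m(s)
        return f(v)(s1)
    return run

# Hypothetical primitive computations over a dictionary store.
def read(name):
    return lambda s: (s, s[name])

def write(name, value):
    return lambda s: ({**s, name: value}, None)

# x := !x + 1, expressed with star playing the role of "let ... in".
incr_x = star(read('x'), lambda v: write('x', v + 1))

# x := !x + 1 ; x := !x + 1 ; !x
program = star(incr_x, lambda _: star(incr_x, lambda _: read('x')))

final_store, result = program({'x': 0, 'y': 5})
```

Note that `final_store` still carries `y` although the program never touches it; this is the coarseness mentioned above, which the translation avoids by passing only the variables listed in the effect.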

Notations. To define the functional translation of programs, we need to manipulate
collections of values and in particular to define functions taking an $n$-tuple
representing the input of a program and returning an $m$-tuple representing its
output. Instead of using tuples, we will rather use records, which are easier to
manipulate. The type of a record will be written $\{ x_1 : X_1; \ldots; x_n : X_n \}$ and a
particular record of that type will be written $\{ x_1 = v_1; \ldots; x_n = v_n \}$, where $x_i$
is a label, $X_i$ its type and $v_i$ a value of that type. Records will be written $\mathbf{x}, \mathbf{y}, \ldots$
for convenience and $\mathbf{x}.l$ will denote the field of label $l$ in record $\mathbf{x}$. We define the
operation $\oplus$ on records as follows: the record $\mathbf{x} \oplus \mathbf{y} = \{ z_1 = w_1; \ldots; z_k = w_k \}$
contains all the labels of $\mathbf{x}$ and $\mathbf{y}$, and $w_i$ is equal to $\mathbf{y}.z_i$ if $z_i$ is a label of $\mathbf{y}$
and to $\mathbf{x}.z_i$ otherwise. In other words, it is the update of the record $\mathbf{x}$ by the
record $\mathbf{y}$, i.e. the record made with the fields of $\mathbf{y}$ when they exist and of $\mathbf{x}$ in the
other case. Finally, $\mathbf{x} \backslash l$ will denote the record $\mathbf{x}$ in which the field $l$ is removed
and $\mathbf{x}[l \leftarrow l']$ will denote the record $\mathbf{x}$ where the label $l$ is renamed into $l'$.

First, we give the interpretation of types with effects as purely functional
types.

Definition 1. The functional translation of types of values and types of computations
are mutually recursively defined as follows:

    values:        $\overline{T} = T$ and $\overline{V \to C} = \overline{V} \to \overline{C}$
    computations:  $\overline{(V, (R, W))} = \overline{R} \to \overline{W} \times \overline{V}$
    effects:       $\overline{\{x_1, \ldots, x_n\}} = \{ x_1 : T_1; \ldots; x_n : T_n \}$ where $x_i$ has type $T_i\ \mathtt{ref}$

If $\Gamma$ is a typing environment mapping the $x_i$'s to the types $V_i$, then $\overline{\Gamma}$ will denote
the environment mapping the $x_i$'s to the types $\overline{V_i}$.

Note that there is no translation for the types $T\ \mathtt{ref}$: indeed, there is no program
expression of such a type and therefore there is no counterpart in the functional
world. References, seen as pointers, do not exist anymore after the translation
since we only manipulate the values they contain.

We are now in a position to define the functional translation itself. In the
following we will use a slight abuse of notation: we will write $(\overline{M}\ \mathbf{x})$ even when
the record $\mathbf{x}$ should be restricted to a subset of its fields (another possibility is
to consider that we have sub-typing on records).
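The effect pairs $(R, W)$ that drive this translation of types can be computed by a direct traversal of the program, following the first-order rules of Figure 2. The following is a minimal sketch, assuming a hypothetical tuple encoding of programs in which a pure expression is represented only by the set of references it dereferences; abstraction and application are left out because their effects come from type information that this sketch does not model.

```python
# Programs as nested tuples (an illustrative encoding):
#   ('exp', {names dereferenced})      pure expression E
#   ('assign', x, M)                   x := M
#   ('seq', M1, M2)                    M1 ; M2
#   ('if', B, M1, M2)                  if B then M1 else M2
#   ('while', B, M)                    while B do M done
#   ('letref', x, M1, M2)              let x = ref M1 in M2

def effect(program):
    """Return the pair (R, W) of references read and possibly written."""
    tag = program[0]
    if tag == 'exp':                               # rule EXP
        return (frozenset(program[1]), frozenset())
    if tag == 'assign':                            # rule ASSIGN
        reads, writes = effect(program[2])
        return (reads | {program[1]}, writes | {program[1]})
    if tag in ('seq', 'if', 'while'):              # rules SEQ, COND, LOOP
        reads, writes = frozenset(), frozenset()
        for sub in program[1:]:
            r, w = effect(sub)
            reads, writes = reads | r, writes | w
        return (reads, writes)
    if tag == 'letref':                            # local reference rule
        r1, w1 = effect(program[2])
        r2, w2 = effect(program[3])
        local = {program[1]}
        return ((r1 | r2) - local, (w1 | w2) - local)
    raise ValueError(f"unsupported construct: {tag}")

# while !i < 4 do i := !i + 1 ; s := !s + !i done
loop = ('while', ('exp', {'i'}),
        ('seq', ('assign', 'i', ('exp', {'i'})),
                ('assign', 's', ('exp', {'s', 'i'}))))

# let t = ref !x in (x := !y ; y := !t)
swap = ('letref', 't', ('exp', {'x'}),
        ('seq', ('assign', 'x', ('exp', {'y'})),
                ('assign', 'y', ('exp', {'t'}))))

loop_effect = effect(loop)
swap_effect = effect(swap)
```

In the `swap` example the local reference `t` is removed from both components, so the translated function would take and return only the values of `x` and `y`.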

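Definition 2. Let $M$ be a program of type $C$ in a context $\Gamma$. Then the functional
translation of $M$, written $\overline{M}$, is a functional program of type $\overline{C}$ in the context $\overline{\Gamma}$,
which is defined by induction on the structure of $M$ as follows:

- $M = E$:

$$\overline{M} = \lambda\mathbf{x}.\ (\{\},\ E[!x \leftarrow \mathbf{x}.x])$$

- $M = x_0 := M_1$:

$$\overline{M} = \lambda\mathbf{x}.\ \mathtt{let}\ (\mathbf{x}_1, v_1) = (\overline{M_1}\ \mathbf{x})\ \mathtt{in}\ (\mathbf{x}_1 \oplus \{ x_0 = v_1 \},\ \mathtt{void})$$

- $M = M_1 ; M_2$:

$$\begin{array}{l} \overline{M} = \lambda\mathbf{x}.\ \mathtt{let}\ (\mathbf{x}_1, \_) = (\overline{M_1}\ \mathbf{x})\ \mathtt{in} \\ \qquad\qquad \mathtt{let}\ (\mathbf{x}_2, v) = (\overline{M_2}\ (\mathbf{x} \oplus \mathbf{x}_1))\ \mathtt{in} \\ \qquad\qquad (\mathbf{x}_1 \oplus \mathbf{x}_2,\ v) \end{array}$$

- $M = \mathtt{if}\ E\ \mathtt{then}\ M_1\ \mathtt{else}\ M_2$:

$$\begin{array}{l} \overline{M} = \lambda\mathbf{x}.\ \mathtt{if}\ (\overline{E}\ \mathbf{x})\ \mathtt{then} \\ \qquad\qquad \mathtt{let}\ (\mathbf{x}_1, v) = (\overline{M_1}\ \mathbf{x})\ \mathtt{in}\ (\mathbf{x} \oplus \mathbf{x}_1,\ v) \\ \qquad\quad \mathtt{else} \\ \qquad\qquad \mathtt{let}\ (\mathbf{x}_1, v) = (\overline{M_2}\ \mathbf{x})\ \mathtt{in}\ (\mathbf{x} \oplus \mathbf{x}_1,\ v) \end{array}$$

It is clear now why we impose the condition that output variables be included
in the input variables: indeed, in both branches of the if we must
return values for the same set of variables. So, when a variable is not modified
by a branch, we have to return its initial value: therefore it must belong
to the input values.

- $M = \mathtt{while}\ E\ \mathtt{do}\ M_1\ \mathtt{done}$:

To translate a loop, we use the following semantic equivalence:

$$M \approx (Y\ (\lambda w : \mathtt{unit} \to \mathtt{unit}.\ \mathtt{if}\ E\ \mathtt{then}\ (M_1 ; (w\ \mathtt{void}))\ \mathtt{else}\ \mathtt{void})\ \mathtt{void})$$

where $Y$ is a fixpoint operator. Then

$$\begin{array}{l} \overline{M} = (Y\ \lambda w : \overline{C}.\ \lambda\mathbf{x}.\ \mathtt{if}\ (\overline{E}\ \mathbf{x})\ \mathtt{then} \\ \qquad\qquad \mathtt{let}\ (\mathbf{x}_1, \_) = (\overline{M_1}\ \mathbf{x})\ \mathtt{in}\ (w\ (\mathbf{x} \oplus \mathbf{x}_1)) \\ \qquad\quad \mathtt{else} \\ \qquad\qquad (\mathbf{x},\ \mathtt{void})) \end{array}$$

Here we assume that we have a fixpoint operator $Y$ in the Calculus of Constructions,
which is not usually the case. This need will disappear when we
are in a position to establish termination, as is the case in the next section.

- $M = \mathtt{let}\ x_0 = \mathtt{ref}\ M_1\ \mathtt{in}\ M_2$:

$$\begin{array}{l} \overline{M} = \lambda\mathbf{x}.\ \mathtt{let}\ (\mathbf{x}_1, v_1) = (\overline{M_1}\ \mathbf{x})\ \mathtt{in} \\ \qquad\qquad \mathtt{let}\ (\mathbf{x}_2, v_2) = (\overline{M_2}\ (\mathbf{x} \oplus \mathbf{x}_1 \oplus \{ x_0 = v_1 \}))\ \mathtt{in} \\ \qquad\qquad (\mathbf{x}_1 \oplus (\mathbf{x}_2 \backslash x_0),\ v_2) \end{array}$$

- $M = \mathtt{fun}\ (x : V) \to M_1$:

$$\overline{M} = \lambda x : \overline{V}.\ \overline{M_1}$$

- $M = (M_1\ M_2)$:

$$\begin{array}{l} \overline{M} = \lambda\mathbf{x}.\ \mathtt{let}\ (\mathbf{x}_1, a) = (\overline{M_2}\ \mathbf{x})\ \mathtt{in} \\ \qquad\qquad \mathtt{let}\ (\mathbf{x}_2, f) = (\overline{M_1}\ (\mathbf{x} \oplus \mathbf{x}_1))\ \mathtt{in} \\ \qquad\qquad \mathtt{let}\ (\mathbf{x}_3, v) = (f\ a\ (\mathbf{x} \oplus \mathbf{x}_1 \oplus \mathbf{x}_2))\ \mathtt{in} \\ \qquad\qquad (\mathbf{x}_1 \oplus \mathbf{x}_2 \oplus \mathbf{x}_3,\ v) \end{array}$$

Notice that we chose a semantics where the function is evaluated after its
argument; therefore multiple arguments are evaluated from right to left.

□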
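The first-order cases of Definition 2 can be mimicked by a handful of combinators. The following is a minimal sketch, assuming Python dictionaries for records (so that `{**x, **y}` plays the role of $\mathbf{x} \oplus \mathbf{y}$) and `None` for void; the combinator names are hypothetical. As in the definition, the restriction of records to the variables listed in the effect is left implicit, and the loop is unfolded with a Python `while` instead of a fixpoint operator.

```python
VOID = None

def expr(f):
    """Pure expression E: reads the input record, modifies nothing."""
    return lambda x: ({}, f(x))

def assign(x0, m1):
    """x0 := M1"""
    def run(x):
        x1, v1 = m1(x)
        return ({**x1, x0: v1}, VOID)
    return run

def seq(m1, m2):
    """M1 ; M2"""
    def run(x):
        x1, _ = m1(x)
        x2, v = m2({**x, **x1})
        return ({**x1, **x2}, v)
    return run

def cond(e, m1, m2):
    """if E then M1 else M2: both branches return the same set of fields."""
    def run(x):
        _, test = e(x)
        x1, v = (m1 if test else m2)(x)
        return ({**x, **x1}, v)
    return run

def while_(e, m1):
    """while E do M1 done, iterating where the definition uses Y."""
    def run(x):
        while e(x)[1]:
            x1, _ = m1(x)
            x = {**x, **x1}
        return (x, VOID)
    return run

# s := 0 ; i := 0 ; while !i < 4 do i := !i + 1 ; s := !s + !i done
body = seq(assign('i', expr(lambda x: x['i'] + 1)),
           assign('s', expr(lambda x: x['s'] + x['i'])))
program = seq(assign('s', expr(lambda x: 0)),
              seq(assign('i', expr(lambda x: 0)),
                  while_(expr(lambda x: x['i'] < 4), body)))

outputs, value = program({'i': 7, 's': 7})
```

On this input the translated program returns the record with `i = 4` and `s = 10` together with void, independently of the initial contents of `i` and `s`.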

To justify this definition, we have to prove that $\overline{M}$ preserves the semantics
of $M$. We chose the formal semantics introduced by A. K. Wright and M. Felleisen
in [18] to prove type soundness of SML with references and exceptions. The idea
is to introduce a syntactic distinction between values and expressions and to
extend the syntax of programs with a new construction $\rho\theta.M$, where $\theta$ is a
mapping from variables to values representing the store. Then small-step reductions
are introduced on extended programs, as rewriting rules driven by the syntax,
and the evaluation relation $M \longrightarrow^{*} \rho\theta'.v$ is defined as the transitive closure of
those reductions.

The preservation of the semantics can be expressed by the following theorem: 

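Theorem 1. Let $\Gamma$ be a well-formed environment whose references are
$x_1 : T_1\ \mathsf{ref}, \ldots, x_n : T_n\ \mathsf{ref}$ and $M$ a program such that $\Gamma \vdash M : (V, (R, W))$.
Let $v_i$ be values of types $T_i$ and $\theta$ the store mapping the $x_i$'s to the $v_i$'s. Then

\[ \rho\theta.M \longrightarrow^{*} \rho\theta'.v \iff \overline{M}(\theta(R)) = (\theta'(W), v) \]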

Proof. We do not go into the details of this proof here. The if part is proved by
induction on the derivation of the evaluation relation and the only if part is proved
by induction on the program M. In both cases the proof is rather systematic 
since the semantic relation is driven by the syntax of programs. The reader may 
consult [18] to get the formal semantics of ML with references, which is greatly 
simplified in our case since we do not have polymorphism. □ 

Call-by-Variable

Until now, we did not consider call-by-variable arguments i.e. the passing of 
references to functions, because their treatment is rather complex when com- 
bined with partial application (it requires the use of painful explicit coercions to 
re-organize elements in tuples). But if we do not allow partial application of
functions with side effects, call-by-variable can be handled simply. To give an idea
of this method, let us consider the simple case of a function $M$ taking a first
argument by value and a second one by variable, i.e. of type $V_1 \to (x : T_2\ \mathsf{ref}) \to (T, \epsilon)$.
Its second argument is given a name, $x$, since it may appear in $\epsilon$. Then the typing
rule for the application $(M\ M_1\ z)$ is the following:

\[ \frac{\Gamma \vdash M : ((V_1 \to (x : T_2\ \mathsf{ref}) \to (T, \epsilon)), \epsilon_0) \qquad \Gamma \vdash M_1 : (V_1, \epsilon_1) \qquad z : T_2\ \mathsf{ref} \in \Gamma}{\Gamma \vdash (M\ M_1\ z) : (T, (\epsilon_0 \cup \epsilon_1 \cup \epsilon[x \leftarrow z]))} \]

During the functional translation, the second argument of $M$ disappears: indeed,
the functional program $\overline{M}$ does not manipulate references, but only their values,
and consequently $\overline{M}$ has type $V_1 \to R \to W \times T$, where $\epsilon = (R, W)$. The
reference $x$ now appears as a field of $R$ and $W$. So the functional translation of
the application $(M\ M_1\ z)$ will be

\[ \begin{array}{l}
\lambda x.\ \mathsf{let}\ (x_1, v_1) = (\overline{M_1}\ x)\ \mathsf{in} \\
\qquad \mathsf{let}\ (x_2, f) = (\overline{M}\ (x \oplus x_1))\ \mathsf{in} \\
\qquad \mathsf{let}\ (x_3, v) = (f\ v_1\ (x \oplus x_1 \oplus x_2)[z \to x])\ \mathsf{in} \\
\qquad (x_1 \oplus x_2 \oplus x_3[x \to z], v)
\end{array} \]

This example is easily generalized to arbitrary numbers of call-by-value and
call-by-variable arguments.

2 Program Correctness 

We now come to the main point, proofs of programs, and the first thing to 
do is to define how to specify the programs. In almost all the literature about 
Floyd-Hoare logic and related systems, the programs are purely imperative and 
therefore can be seen as sequences of statements with separated notions of state- 
ments and expressions. Typically we have an abstract syntax of the kind 

S ::= skip | x := E | S ; S | if E then S else S | while E do S done

where E is a pre-defined notion of expressions. Then it is natural to specify them
by inserting logical assertions between the successive evaluations i.e. between the 
statements. 

But in our case, following the tradition of functional programming languages, 
the notions of programs and expressions are identified. For instance, we can write 
programs like 

x := (x := !x + 1 ; !x)

x := (if B then ... else ...)

This may appear superfluous but we claim that this is the key to deal with 
functions without difficulty. Take for instance a statement like 

x := (f a)

where f is a function which possibly has side-effects. Since our abstract syntax
already includes statements of the form x := M where M is an arbitrary
program, the above assignment does not change the effects. In other words, inlining
the function f would give a correct program, which is not the case in traditional
frameworks.

Since the notions of programs and expressions are identified, we propose a
more general way to specify programs, where each sub-expression of a program
may be annotated with a pre- and a post-condition. A pre-condition will be a
predicate over the current values of variables, as usual. For the post-conditions,
we will use before-after predicates, i.e. predicates referring to the current values
of variables and to their values before the evaluation of the expression, as is
done in VDM. The current value of a variable $x$ will be written $x$ in both pre-
and post-conditions and its value before the evaluation will be written $\overleftarrow{x}$ in a
post-condition. Moreover, a post-condition must be able to express properties of
the result of the evaluation, which will be given a name in post-conditions. The
abstract syntax of programs is extended as follows:

\[ \begin{array}{rcl}
M & ::= & \{P\}\ S\ \{v \mid Q\} \\
S & ::= & E \mid x := M \mid M ; M \mid \ldots\ \text{(as formerly)}
\end{array} \]

where $\{P\}$ stands for a pre-condition and $\{x \mid Q\}$ for a post-condition in which
the result is referred to as $x$. The type systems $V$ and $C$ for values and computations
are enriched accordingly:

\[ V ::= T \mid T\ \mathsf{ref} \mid (x : V) \to C \qquad (4) \]

\[ C ::= (v : V, \epsilon, P, Q) \qquad (5) \]

Note that function arguments are given names since they may now appear in 
annotations. 

As previously, we first have to give an interpretation of types in the target
language, the Calculus of Inductive Constructions. Those interpretations are
written $\widehat{V}$ and $\widehat{C}$, $\widehat{C}$ being the correctness formula. A first idea for $\widehat{C}$ could be

\[ \forall x.\ P(x) \Rightarrow \exists (y, v).\ (y, v) = \overline{M}(x) \wedge Q(x, y, v) \qquad (6) \]

But $\overline{M}$ is usually not a total function and therefore is not definable in the Cic.
Thus, we choose to define $\widehat{C}$ just as

\[ \forall x.\ P(x) \Rightarrow \exists (y, v).\ Q(x, y, v) \qquad (7) \]

and we will construct a particular proof of (7), $\widehat{M}$, whose computational
content is equal to $\overline{M}$. Therefore, the realizability theorem [14]
will exactly express (6), which is the expected result.

Definition 3. The interpretations in Cic of the types of values and the types of
computations are defined as follows:

\[ \begin{array}{l}
\text{if } V = T \text{ then } \widehat{V} = T \\
\text{if } V = (x : V_0) \to C \text{ then } \widehat{V} = \forall x : \widehat{V_0}.\ \widehat{C} \\
\text{if } C = (r : V, (R, W), P, Q) \\
\qquad \text{then } \widehat{C} = \forall x : R.\ P(x) \Rightarrow \exists (y, v) : W \times \widehat{V}.\ Q(x, y, v)
\end{array} \]

If $\Gamma$ is a typing environment mapping the $x_i$'s to the types $V_i$, then $\widehat{\Gamma}$ will denote
the environment mapping the $x_i$'s to the types $\widehat{V_i}$.

Before giving the translation of the programs themselves, we have to solve a
last problem: since functions in the Cic are necessarily total, a loop cannot be
simply translated using a general fixpoint operator. Moreover, we are interested
in proving total correctness and therefore we have to justify the termination of
loops. Thus, loops will be annotated with a well-foundedness argument:

\[ \mathsf{while}_{\phi, R}\ E\ \mathsf{do}\ M\ \mathsf{done} \]

where $\phi$ is a term and $R$ a relation over the type of $\phi$. Then such a loop will be
translated using a well-founded induction over $\phi$.

In the following, an incomplete proof term of the Cic is a term where some
sub-terms are still undefined and written “?”. We assume that those gaps are
typed, but we will not write the corresponding types, for the sake of clarity.

Definition 4. Let $M$ be a program of type $C$ in a context $\Gamma$. Then the
interpretation of the program $M$ in the Cic, written $\widehat{M}$, is an incomplete proof term
of type $\widehat{C}$ in the context $\widehat{\Gamma}$, which is defined by induction on the structure of $M$
as follows:

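- $M = \{P\}\ E\ \{Q\}$ :

\[ \widehat{M} = \lambda x.\ \lambda h : P.\ (\{\},\ E[!x \leftarrow x.x],\ ?) \]

- $M = \{P\}\ x_0 := M_1\ \{Q\}$ :

\[ \widehat{M} = \lambda x.\ \lambda h : P.\ \mathsf{let}\ (x_1, v_1, q_1) = (\widehat{M_1}\ x\ ?)\ \mathsf{in}\ (x_1 \oplus \{ x_0 = v_1 \},\ \mathsf{void},\ ?) \]

- $M = \{P\}\ M_1 ; M_2\ \{Q\}$ :

\[ \begin{array}{l}
\widehat{M} = \lambda x.\ \lambda h : P.\ \mathsf{let}\ (x_1, \_, q_1) = (\widehat{M_1}\ x\ ?)\ \mathsf{in} \\
\qquad\quad \mathsf{let}\ (x_2, v, q_2) = (\widehat{M_2}\ (x \oplus x_1)\ ?)\ \mathsf{in} \\
\qquad\quad (x_1 \oplus x_2, v, ?)
\end{array} \]

- $M = \{P\}\ \mathsf{if}\ E\ \mathsf{then}\ M_1\ \mathsf{else}\ M_2\ \{Q\}$ :

\[ \begin{array}{l}
\widehat{M} = \lambda x.\ \lambda h : P.\ \mathsf{if}\ (E\ x)\ \mathsf{then} \\
\qquad\quad \mathsf{let}\ (x_1, v, q) = (\widehat{M_1}\ x\ ?)\ \mathsf{in}\ (x \oplus x_1, v, ?) \\
\qquad \mathsf{else} \\
\qquad\quad \mathsf{let}\ (x_1, v, q) = (\widehat{M_2}\ x\ ?)\ \mathsf{in}\ (x \oplus x_1, v, ?)
\end{array} \]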

- $M = \{P\}\ \mathsf{while}_{\phi, R}\ E\ \mathsf{do}\ M_1\ \mathsf{done}\ \{Q\}$ :

To construct the proof term corresponding to this loop, we are going to use
a well-founded induction over $\phi$. The well-founded recursor is a higher-order
term taking a relation $R$ and a proof $\pi$ that $R$ is well-founded. In our
case, the proof $\pi$ is replaced by a proof obligation.

\[ \begin{array}{l}
\widehat{M} = \lambda x.\ ((Y_R\ ?\ \lambda w.\ \lambda x.\ \lambda \psi.\ \lambda h : P. \\
\qquad\quad \mathsf{if}\ (E\ x)\ \mathsf{then} \\
\qquad\qquad \mathsf{let}\ (x_1, \_, q) = (\widehat{M_1}\ x\ ?)\ \mathsf{in}\ (w\ (x \oplus x_1)\ \phi(x \oplus x_1)\ ?\ ?) \\
\qquad\quad \mathsf{else} \\
\qquad\qquad (x, \mathsf{void}, ?))\ x\ \phi(x))
\end{array} \]

Here we have used $P$ as the loop invariant. There are four proof obligations
related to that loop: to prove that $R$ is well-founded; to prove that $\phi$ strictly
decreases in $M_1$; to prove that $M_1$ preserves $P$; and to prove that $P$ implies $Q$
when the test is negative.

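- $M = \{P\}\ \mathsf{let}\ x_0 = \mathsf{ref}\ M_1\ \mathsf{in}\ M_2\ \{Q\}$ :

\[ \begin{array}{l}
\widehat{M} = \lambda x.\ \lambda h : P.\ \mathsf{let}\ (x_1, v_1, q_1) = (\widehat{M_1}\ x\ ?)\ \mathsf{in} \\
\qquad\quad \mathsf{let}\ (x_2, v_2, q_2) = (\widehat{M_2}\ (x \oplus x_1 \oplus \{ x_0 = v_1 \})\ ?)\ \mathsf{in} \\
\qquad\quad (x_1 \oplus (x_2 \setminus x_0), v_2, ?)
\end{array} \]

- $M = \{P\}\ \mathsf{fun}\ (x : V) \to M_1\ \{Q\}$ :

\[ \widehat{M} = \lambda h : P.\ ((\lambda x : \widehat{V}.\ \widehat{M_1}),\ ?) \]

Notice that here $P$ and $Q$ are the pre- and post-conditions of the function
and not of the result of its application. Usually they are empty.

- $M = \{P\}\ (M_1\ M_2)\ \{Q\}$ :

\[ \begin{array}{l}
\widehat{M} = \lambda x.\ \lambda h : P.\ \mathsf{let}\ (x_1, a, q_1) = (\widehat{M_2}\ x\ ?)\ \mathsf{in} \\
\qquad\quad \mathsf{let}\ (x_2, f, q_2) = (\widehat{M_1}\ (x \oplus x_1)\ ?)\ \mathsf{in} \\
\qquad\quad \mathsf{let}\ (x_3, v, q_3) = (f\ a\ (x \oplus x_1 \oplus x_2)\ ?)\ \mathsf{in} \\
\qquad\quad (x_1 \oplus x_2 \oplus x_3, v, ?)
\end{array} \]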

□ 
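The four proof obligations attached to an annotated loop can be illustrated, very informally, by checking them at run time on one execution. The sketch below is an assumption-laden illustration and not the method of the paper: it tests the obligations dynamically instead of proving them, it fixes the relation $R$ to the usual order on natural numbers (which is well-founded), and the names `checked_while` and `sum_below` are hypothetical.

```python
def checked_while(test, body, invariant, variant, post):
    """Run `while test do body done`, asserting the loop obligations as it goes.

    invariant plays the role of P, variant the role of phi, and post is a
    before-after predicate Q(old_state, new_state).
    """
    def run(x0):
        assert invariant(x0), "pre-condition P must hold on entry"
        x = x0
        while test(x):
            before = variant(x)
            x = body(x)
            assert invariant(x), "obligation: the body preserves P"
            # R is fixed to < on natural numbers, hence well-founded.
            assert 0 <= variant(x) < before, "obligation: phi strictly decreases"
        assert post(x0, x), "obligation: P and a negative test imply Q"
        return x
    return run


# s := 0 + 1 + ... + (n - 1), with invariant s = sum of the integers below i.
sum_below = checked_while(
    test=lambda s: s["i"] < s["n"],
    body=lambda s: {**s, "s": s["s"] + s["i"], "i": s["i"] + 1},
    invariant=lambda s: 0 <= s["i"] <= s["n"] and s["s"] == sum(range(s["i"])),
    variant=lambda s: s["n"] - s["i"],
    post=lambda old, new: new["s"] == sum(range(old["n"])),
)
final = sum_below({"i": 0, "s": 0, "n": 5})
```

A loop whose variant fails to decrease trips the corresponding assertion, which mirrors an unprovable proof obligation in the static setting.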

To justify the validity of the proof obligations we obtain, we have to prove
that they apply to the right values. It directly results from Theorem 1 and from
the following proposition (which is immediate by construction of $\widehat{M}$):

Proposition 1. For any program $M$ the following equality holds

\[ \mathcal{E}(\widehat{M}) = \overline{M} \qquad (8) \]

where $\mathcal{E}$ is the extraction operator, which computes the informative contents of
a proof (extraction in the Calculus of Inductive Constructions was introduced by
C. Paulin in [14]).

Examples and Comparison to Floyd-Hoare Logic 

Example 1. Let us consider the classical case of assignment where the right hand 
side has no effect and no annotation i.e. 

\[ M = \{P\}\ x_0 := E\ \{Q\} \]

Then the proof term $\widehat{M}$ reduces to $\lambda x.\ \lambda h : P(x).\ (x \oplus \{ x_0 = E \},\ \mathsf{void},\ ?)$ where
the gap has type $Q(x,\ x \oplus \{ x_0 = E \},\ \mathsf{void})$. So we get only one proof obligation,
which is the following, with some abuses of notation:

\[ P \Rightarrow Q[x_0 \leftarrow E] \]

This is exactly the one we get in Floyd-Hoare logic (with the combination of the
consequence rule and the assignment rule). □

Example 2. Let us consider again an assignment, but where the expression
assigned is now the result of a function call, that is $M = \{P\}\ x_0 := (f\ E)\ \{Q\}$.
The function $f$ has a type $V$ of the form $V_0 \to (v : V_1, \epsilon, P_f, Q_f)$. Therefore, we
can see the program $M$ as annotated this way:

\[ M = \{P\}\ x_0 := \{P_f\}\ (f\ E)\ \{v \mid Q_f\}\ \{Q\} \]

Then if we look at the proof term $\widehat{M}$, we find two gaps, the first one corresponding
to the pre-condition $P_f$ and the second one to the post-condition $Q$. More
precisely, the two proof obligations are the following:

\[ P \Rightarrow P_f \quad \text{and} \quad P \Rightarrow P_f \Rightarrow Q_f \Rightarrow Q[x_0 \leftarrow v] \]

Contrary to the previous case, the expression assigned is no longer substituted in
the post-condition but abstracted through the variable $v$. □

3 Implementation 

This work has been implemented in the Coq Proof Assistant [1] and is currently
released with the system, together with documentation and a few examples.
The user gives an annotated imperative program to a tactic, and a set of proof
obligations is produced, which must be proved to complete the correctness
proof. As explained at the end of the first section, we have call-by-variable but 
without partial application (so it is closer to Pascal than ML regarding this 
point). The implementation also includes some additional features, namely ar- 
rays and a way to eradicate the use of auxiliary variables. 

Arrays. When dealing with arrays, we face a potential source of aliasing prob- 
lems, since t[i] and t[j] may refer to the same cell of an array, while it is not 
possible to decide statically if i is equal to j or not. A standard solution is to 
consider the whole array as a mutable data, like any reference (see [4], page 931). 
Consequently, programs manipulating arrays become functions taking arrays as 
arguments and returning new arrays. Arrays are axiomatized in the Cic in the 
very simplest way, and proof obligations are produced to check that indexes are 
always within the bounds of arrays. 
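The treatment of arrays as whole mutable values can be sketched as follows. This is only an illustration under assumptions: arrays are modelled as immutable Python tuples, the bound-checking proof obligations are replaced by run-time assertions, and the names `access`, `store` and `swap` are hypothetical rather than taken from the implementation.

```python
def access(t, i):
    """Read t[i]; the assertion stands for the bound-checking proof obligation."""
    assert 0 <= i < len(t), "index out of bounds"
    return t[i]


def store(t, i, v):
    """Return a new array equal to t except that cell i holds v."""
    assert 0 <= i < len(t), "index out of bounds"
    return t[:i] + (v,) + t[i + 1:]


def swap(t, i, j):
    """Functional counterpart of: tmp := t[i]; t[i] := t[j]; t[j] := tmp."""
    vi, vj = access(t, i), access(t, j)
    return store(store(t, i, vj), j, vi)


original = (1, 2, 3)
swapped = swap(original, 0, 2)
```

Because each update yields a new array value, the case where `i` and `j` denote the same cell needs no special treatment: `swap(t, 1, 1)` simply returns an array equal to `t`.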

The case of auxiliary variables. Auxiliary variables, sometimes called logical 
variables, are used in specifications to relate values of variables at different mo- 
ments of the execution of a program. Indeed, the before-after predicates used 
in post-conditions are not always powerful enough to express the specifications 
(see for instance [4], page 940). Although it is possible to give a formal inter- 
pretation to auxiliary variables, as shown by T. Schreiber in [15], we propose 
another way to solve this problem. The idea is very simple: since our functional
translation defines names for each new value of a variable (at each application of
the monadic let ... in operator), we just have to allow the user to access those names.
So we added the possibility to put labels inside programs (like the ones used for
the goto statement in old programming languages), and the user may refer to
the value of a given variable at a particular moment using the associated label. 
Then the substitution inside pre- and post-conditions is performed during the 
construction of M . This is a rather technical point, but it appeared to be very 
useful in practice, and the proof obligations generated are simpler to prove since 
they do not require rewriting. 

Case studies. The first case study we did with our implementation is the program
find, a quite complex algorithm which was proved correct by Hoare using his
axiomatic logic in 1971 [9]. Applied to that annotated program, our method
generated exactly the same proof obligations as the ones given in Hoare's
paper. We also did a correctness proof of the quicksort algorithm and of the
Knuth-Morris-Pratt string searching algorithm. Another case study is a proof
of insertion sort, which was done recently by a novice user of the Coq system.
Those case studies cover all the functionalities of the method, including loops,
procedures and recursive functions.

4 Discussion and Future Work

We have set out in this paper a method to establish the total correctness of im- 
perative programs in Type Theory. It shares the goals of traditional Floyd-Hoare 
logic, regarding programs annotations and proof obligations we expect to get, 
but this method uses a completely different approach, based on the construc- 
tion of an incomplete proof of its correctness, itself built on top of a functional 
translation of the imperative program. On sequential imperative programs, we 
get in practice the same proof obligations as Floyd-Hoare logic, which was the 
expected result. 

Compared to the B method of J.-R. Abrial [2], our method, even if it does not have
the same level of maturity, offers some benefits. First, it is based on Type Theory,
a framework which is small, powerful and well understood from a theoretical point
of view. Secondly, we are able to express the correctness of a program and its
proof in the logic, which is not the case in traditional methods where programs,
their specifications and their proofs live in different worlds and where correctness
depends on the correctness of the tools relating those parts. The B method
was recently partly formalized in Isabelle/HOL by P. Chartier [3]. But such
a formalization has the same drawbacks as the Floyd-Hoare logic formalizations
already mentioned [7,15]: they may be useful to prove theoretical properties of
the framework but cannot be applied in practice because of painful encodings.

Closest to our approach is the one of C. Munoz [12], who formalized the B 
abstract machines in PVS. As we do, he directly manipulates objects of the 
Type Theory, which extends the range of datatypes of B, and uses record de- 
pendent types to define the states of abstract machines (a collection of variables 
with a global invariant). Then operations are seen as functions on states, and 
the preservation of the invariant is expressed by typing, proof obligations being 
generated by the type-checking conditions mechanism of PVS.

Regarding future work, we would like to incorporate pattern-matching and
exceptions to reach a more realistic language. While pattern-matching is only
technical (semantically, it can be viewed as a combination of tests and access
functions), the treatment of exceptions needs an extension of the notion
of effects and of the specification. However, we think that our framework is
well-suited for such an extension, since the only thing to do is to define the
corresponding monad.

Abstract. This article presents a formulation of the fan theorem in
Martin-Löf's type theory. Starting from one of the standard versions of
the fan theorem we gradually introduce reformulations leading to a final 
version which is easy to interpret in type theory. Finally we describe a 
formal proof of that final version of the fan theorem. 

Keywords: type theory, fan theorem, inductive bar. 



1 Introduction 

In informal constructive mathematics, the fan theorem is an easy consequence 
of the rule of bar induction. Both are about infinite objects, which makes their
interpretation in Martin-Löf's type theory nontrivial. Bar induction can be
represented in type theory, as proposed in [Mar68] and shown also in this article. 
But still from this interpretation it is not clear how to formulate and prove the 
fan theorem formally in type theory. 

This is because, whereas the usual informal language to treat bar induction 
and the fan theorem is the same, the formal treatment of the fan theorem in type 
theory is technically more involved than that of bar induction. The concept of 
finiteness is difficult to handle simultaneously in an elegant, completely formal 
and constructive way; and it seems hard to avoid dealing explicitly with fans, 
whereas spreads are avoided in the type-theoretic interpretation of bar induction. 

The fan theorem is very important in constructive mathematics since it makes
it possible to reconstruct large parts of traditional analysis. For explanations of
the fan theorem and its role in constructive analysis see for instance [Dum77]
and [TvD88].

The goal in this article is to present a formulation and a proof of the fan
theorem in type theory. The type-theoretic version of the fan theorem presented
here has been used in [Fri97] to interpret in type theory an intuitionistic proof
of Higman's lemma which uses the fan theorem [Vel94]. However, in [Fri97] the
type-theoretic fan theorem is only mentioned and the proof is omitted. The
importance of the fan theorem justifies this more extended presentation.

Type theory here means Martin-Löf's type theory, of which there exist
different formulations (for example, [Mar75], [Mar84], [NPS90] and [Tas97]). The
exposition here should suit all of them. The proof of the fan theorem presented 
here has been written down in full detail with the assistance of the proof-editor 
ALF [Mag94] which is an implementation of the formulation of type theory given 
in [Tas97]. 

The rest of this article is organized as follows. Section 2 introduces some 
notations and definitions to be used in the whole article, and gives an informal 
presentation of bar induction and the fan theorem. 

Section 3 shows a type-theoretic interpretation of bar induction and illus- 
trates its use by proving some of its properties which are useful for the rest of 
the article. 

Section 4 formulates and proves the fan theorem in type theory. 

Finally, Section 5 presents a result by Veldman [Vel98] related to the
formulations of the fan theorem given in Section 2.

The contributions of this article are the alternative informal formulations of 
the fan theorem in Section 2 and the formalization of the fan theorem in type 
theory, in Section 4. 



2 Bar Induction and the Fan Theorem 

This section introduces the notations to be used in the whole article and gives 
an informal presentation of bar induction and the fan theorem. Several refor- 
mulations of the fan theorem are introduced leading to Theorem 5, which is the 
version that is formalized in type theory in Section 4. 



2.1 Preliminaries 

Notations: 

$\mathcal{N}$ the set of the natural numbers. Variables: $n$, $m$, $k$.

$A^*$ the set of the lists (finite sequences) of elements of the set $A$. Variables: $u$,
$v$, $w$. Even $u$, $v$, $w$ when $A$ is a set of lists.

$\langle a_1, \ldots, a_n \rangle$ is the notation for lists.

$u * v$ is the concatenation between lists.

$u \bullet a$ is a notation for concatenations of the form $u * \langle a \rangle$.

The variables $\alpha$, $\beta$ are used to denote infinite sequences of natural numbers.
An initial segment $\langle \alpha(0), \ldots, \alpha(n-1) \rangle$ of $\alpha$ is denoted $\overline{\alpha}(n)$. Given a set $S$ of
finite sequences of natural numbers, if $\forall n\ [\overline{\alpha}(n) \in S]$, then we write $\alpha \in S$. We
denote by $S^c$ the set $\mathcal{N}^* \setminus S$.

Definition 1 (tree). A tree is a set $T$ of finite sequences of natural numbers
(intuitively, a set of finite branches) which satisfies

\[ \begin{array}{ll}
\langle\rangle \in T & T \text{ is inhabited} \\
\forall u\ [u \in T \vee u \notin T] & T \text{ is decidable} \\
\forall u, n\ [u \bullet n \in T \Rightarrow u \in T] & T \text{ is closed under predecessor.}
\end{array} \]

Definition 2 (finitely branching). A finitely branching tree is a tree $T$ which
satisfies

\[ \forall u \in T\ \exists m\ \forall n\ [u \bullet n \in T \Rightarrow n < m]. \]

Definition 3 (spread, fan). A spread is a tree in which every node has at
least one successor, that is, a tree $S$ satisfying

\[ \forall u \in S\ \exists n\ [u \bullet n \in S]. \]

A finitely branching spread is called a fan.

Definition 4 (bar). Given a set $\mathcal{U} \subseteq \mathcal{N}^*$ and a spread $S$, $\mathcal{U}$ is a bar on $S$ if

\[ \forall \alpha \in S\ \exists n\ [\overline{\alpha}(n) \in \mathcal{U}]. \]

When $S = \mathcal{N}^*$, $S$ is called the universal spread and $\mathcal{U}$ is said to be a bar.

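Proposition 1. Given a spread $S$ and a bar $\mathcal{U}$ on $S$, then $\mathcal{V} = \mathcal{U} \cup S^c$ is a bar.

We can prove that $\mathcal{V}$ is a bar by letting $\alpha$ be an arbitrary infinite sequence
of natural numbers and finding $n$ such that $\overline{\alpha}(n) \in \mathcal{V}$. To this end, we determine
a sequence of natural numbers $\beta$ whose initial segments are the same as those
of $\alpha$ as long as they belong to $S$. As soon as an initial segment of $\alpha$ does not
belong to $S$, $\beta$ deviates from $\alpha$. From that point, the initial segments of $\beta$ are
arbitrary segments in $S$. That is,

\[ \beta(i) = \begin{cases}
\alpha(i) & \text{if } \overline{\alpha}(i+1) \in S \\
k & \text{if } \overline{\alpha}(i+1) \notin S, \text{ for some } k \text{ such that } \overline{\beta}(i) \bullet k \in S
\end{cases} \]

As $\beta \in S$, and $\mathcal{U}$ is a bar on $S$, we can obtain $n$ such that $\overline{\beta}(n) \in \mathcal{U}$. Now,
either $\overline{\alpha}(n) \in S$, in which case $\overline{\alpha}(n) = \overline{\beta}(n) \in \mathcal{U} \subseteq \mathcal{V}$, or $\overline{\alpha}(n) \notin S$, hence
$\overline{\alpha}(n) \in S^c \subseteq \mathcal{V}$. Therefore, $\mathcal{V}$ is a bar.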
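The construction of $\beta$ in this proof is effective, and a small sketch may help to see why. The code below is an illustration under assumptions: the spread is given by a decidable membership test on finite lists, the infinite sequence $\alpha$ by a function on indices, and the witness $k$ is found by searching upwards from 0. The names `beta_prefix`, `binary` and `alpha` are hypothetical.

```python
def beta_prefix(alpha, in_spread, n):
    """Return the initial segment [beta(0), ..., beta(n-1)].

    alpha     : function i -> alpha(i), an infinite sequence of naturals
    in_spread : decidable membership test of a spread S on finite lists
    """
    beta = []
    for i in range(n):
        alpha_segment = [alpha(j) for j in range(i + 1)]
        if in_spread(alpha_segment):
            # alpha is still inside S: beta follows alpha.
            beta.append(alpha(i))
        else:
            # alpha has left S: pick some k keeping beta inside S.
            # The search terminates because every node of a spread
            # has at least one successor.
            k = 0
            while not in_spread(beta + [k]):
                k += 1
            beta.append(k)
    return beta


def binary(u):
    """Membership test for the spread of all finite 0/1 sequences."""
    return all(a in (0, 1) for a in u)


def alpha(i):
    """The sequence 0, 1, 2, 3, ... which leaves the binary spread at index 2."""
    return i


segment = beta_prefix(alpha, binary, 4)
```

Here `segment` agrees with `alpha` on its first two entries and then continues inside the spread, so every initial segment of the resulting sequence belongs to the spread, as the proof requires.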



2.2 Bar Induction 



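Bar induction is the following rule, which is an axiom of intuitionistic logic

\[ \frac{\begin{array}{ll}
\forall u \in \mathcal{X}\ [u \in \mathcal{Y}] & \mathcal{X} \text{ is included in } \mathcal{Y} \\
\forall u \in \mathcal{X}\ \forall n\ [u \bullet n \in \mathcal{X}] & \mathcal{X} \text{ is monotone} \\
\forall u\ \{ [\forall n\ u \bullet n \in \mathcal{Y}] \Rightarrow u \in \mathcal{Y} \} & \mathcal{Y} \text{ is hereditary} \\
\forall \alpha\ \exists n\ [\overline{\alpha}(n) \in \mathcal{X}] & \mathcal{X} \text{ is a bar}
\end{array}}{\langle\rangle \in \mathcal{Y}}\ \mathrm{BI} \]

for $\mathcal{X}, \mathcal{Y} \subseteq \mathcal{N}^*$. For other formulations of the rule of bar induction and their
justification see [Dum77].

2.3 Fan Theorem

The most important consequence of the rule of bar induction is the fan theorem.

Theorem 1 (fan theorem). Given a fan $\mathcal{F}$, and a monotone bar $\mathcal{U}$ on $\mathcal{F}$, then

\[ \exists n\ \forall \alpha \in \mathcal{F}\ [\overline{\alpha}(n) \in \mathcal{U}]. \]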

Intuitively, the fan theorem states that for any finitely branching tree all
of whose branches are finite, there is an upper bound on the length of the branches.
The tree, not explicit in the statement of the theorem, is the set $\mathcal{F} \setminus \mathcal{U}$ (when $\mathcal{U}$
is decidable and $\langle\rangle \notin \mathcal{U}$).

The fan theorem can also be read as stating that every finitely branching
tree all of whose branches are finite is itself finite, that is, has a finite number of
nodes. This is so, since for a finitely branching tree, the existence of an upper
bound on the length of the branches is equivalent to it being finite.

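A proof of the fan theorem can be obtained using the rule of bar induction
with $\mathcal{X} = \mathcal{U} \cup \mathcal{F}^c$ and $\mathcal{Y} = \{ u \mid \exists n\ \forall \alpha \in \mathcal{F}\ [\alpha \text{ starts with } u \Rightarrow \overline{\alpha}(n) \in \mathcal{U}] \}$.
Proposition 1 guarantees that $\mathcal{X}$ is a bar. The monotonicity of $\mathcal{X}$ follows from
those of $\mathcal{U}$ and $\mathcal{F}^c$. The inclusion of $\mathcal{X}$ in $\mathcal{Y}$ can be proved by letting $u \in \mathcal{X}$ be
arbitrary and choosing $n$ as the length of $u$. To prove that $\mathcal{Y}$ is hereditary we
assume that, for an arbitrary $u$, $\forall k\ [u \bullet k \in \mathcal{Y}]$ holds, and prove that $u \in \mathcal{Y}$ also
holds. If $u \notin \mathcal{F}$ then $u \in \mathcal{Y}$ clearly holds, since no $\alpha \in \mathcal{F}$ starts with $u$. Otherwise,
as $\mathcal{F}$ is finitely branching there exists $m$ such that for all $k$, $u \bullet k \in \mathcal{F} \Rightarrow k < m$.
As for each $k$, $u \bullet k \in \mathcal{Y}$, it is possible to determine $n_0, \ldots, n_{m-1}$ such that for
each $k < m$ and $\alpha \in \mathcal{F}$, if $\alpha$ starts with $u \bullet k$, then $\overline{\alpha}(n_k) \in \mathcal{U}$. To show that
$u \in \mathcal{Y}$, we choose $n$ to be $\max \{ n_k \mid k < m \}$ and use the monotonicity of $\mathcal{U}$.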



2.4 Other Formulations of the Fan Theorem 

So far, we have used the terminology which is standard in the literature. It is 
possible to give alternative presentations of the fan theorem, some of which
are actually not formulated in terms of fans but in terms of arbitrary finitely
branching trees. 

In this section, we explore other formulations of the fan theorem with the 
purpose of obtaining one which is easier to represent in type theory. We shall see 
that there is no need to introduce notions like fan or tree in type theory, since 
the fan theorem can be reformulated without explicit use of those notions. 

Some of the formulations that we will introduce are in terms of a special kind 
of tree, which we call independent- choice trees. 

Definition 5 (independent-choice). An independent-choice tree is a tree $\mathcal{I}$
such that for all $u, v \in \mathcal{I}$ of equal length,

\[ \forall n\ [u \bullet n \in \mathcal{I} \Rightarrow v \bullet n \in \mathcal{I}]. \]

There is a one-to-one correspondence between independent-choice fans and
infinite sequences of nonempty finite subsets of $\mathcal{N}$. An independent-choice fan $\mathcal{I}$ is
uniquely determined by a sequence $I_0, I_1, \ldots$ of nonempty finite subsets of $\mathcal{N}$.
The branches of $\mathcal{I}$ of length $n$ are obtained by choosing one element from each of
the sets $I_0, I_1, \ldots, I_{n-1}$ in that order. Every choice is independent of the other
choices done to determine the branch. Similarly, there is a one-to-one
correspondence between independent-choice finitely branching trees and (not necessarily
infinite) sequences of nonempty finite subsets of $\mathcal{N}$.
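The correspondence between an independent-choice fan and a sequence of finite sets can be made concrete for a finite prefix of the sequence. The following sketch assumes the sets are given as a Python list and represents branches as tuples; the name `branches` is hypothetical.

```python
from itertools import product


def branches(choice_sets, n):
    """All branches of length n: one element chosen from each of the first n sets."""
    return set(product(*choice_sets[:n]))


choice_sets = [{0, 1}, {2}, {3, 4}]
level_two = branches(choice_sets, 2)
level_three = branches(choice_sets, 3)
```

For `n = 0` the product is empty and the only branch is the empty sequence, which matches the requirement that a tree contains $\langle\rangle$.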

The notion of independent-choice tree turns out to be very useful for obtain- 
ing a reformulation of the fan theorem easier to interpret in type theory. 

We list first a few statements equivalent to Theorem 1. 

Theorem 2 (alternatives to fan theorem). The fan theorem is equivalent
to the validity of

\[ \forall \text{ monotone bar } \mathcal{U}\ \exists n\ \forall \alpha \in T\ [\overline{\alpha}(n) \in \mathcal{U}] \]

in any of the following cases:

1. for all fans $T$,

2. for all finitely branching trees $T$,

3. for all independent-choice fans $T$,

4. for all independent-choice finitely branching trees $T$.

The only difference between the fan theorem and item 1 is that in the lat- 
ter U runs over bars on the universal spread, rather than over bars on the fan. 
With this modification, the fan theorem can be formulated for finitely branching 
trees as well (item 2). On the other hand, it is enough to restrict attention to 
independent-choice fans or trees (items 3 and 4). 

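To prove Theorem 2 notice that the domain on which $T$ ranges in item 2
includes the one on which it ranges in item 1, and so item 2 $\Rightarrow$ item 1.
Analogously, item 2 $\Rightarrow$ item 4, item 1 $\Rightarrow$ item 3, and item 4 $\Rightarrow$ item 3. Similarly,
the domain on which $\mathcal{U}$ ranges in the fan theorem includes the one on which it
ranges in Theorem 2, so Theorem 1 $\Rightarrow$ item 1.

To finish the proof of Theorem 2 it is enough to prove that item 3 $\Rightarrow$ item 2
and item 1 $\Rightarrow$ Theorem 1. For the former, let $T$ be an arbitrary finitely branching
tree and $\mathcal{U}$ an arbitrary monotone bar. Let $\mathcal{I}$ be the least independent-choice fan
containing $T$. Determine $n$ such that $\forall \alpha \in \mathcal{I}\ [\overline{\alpha}(n) \in \mathcal{U}]$. As $\alpha \in T \Rightarrow \alpha \in \mathcal{I}$,
we obtain $\forall \alpha \in T\ [\overline{\alpha}(n) \in \mathcal{U}]$.

Finally, to prove that item 1 $\Rightarrow$ Theorem 1, let $\mathcal{F}$ be an arbitrary fan and $\mathcal{U}$
an arbitrary bar on $\mathcal{F}$. Define $\mathcal{V} = \mathcal{U} \cup \mathcal{F}^c$. By Proposition 1, $\mathcal{V}$ is a bar. Then, by
item 1 there is an $n$ such that for all $\alpha \in \mathcal{F}$, $\overline{\alpha}(n) \in \mathcal{V}$. As $\overline{\alpha}(n) \in \mathcal{F}$, $\overline{\alpha}(n) \in \mathcal{U}$.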

Observe how letting $\mathcal{U}$ run only over bars on the universal spread, rather than
over bars on $T$, opens the possibility of a number of alternatives to the original
formulation of the fan theorem. Indeed, Veldman showed that the same kind of
alternatives do not hold intuitionistically if $\mathcal{U}$ is taken to be an arbitrary bar
over $T$. This is further explained in Section 5.

The next theorem says that more formulations can be obtained, where
quantification over infinite sequences of natural numbers is avoided.

Theorem 3 (more alternatives to fan theorem). The fan theorem is
equivalent to the validity of

\[ \forall \text{ monotone bar } \mathcal{U}\ \exists n\ \forall u \in T\ [\mathrm{length}(u) = n \Rightarrow u \in \mathcal{U}] \]

in any of the following cases:

5. for all fans $T$,

6. for all finitely branching trees $T$,

7. for all independent-choice fans $T$,

8. for all independent-choice finitely branching trees $T$.

Just as in the proof of Theorem 2, it is easy to obtain that item 6 $\Rightarrow$ item 5,
item 6 $\Rightarrow$ item 8, item 5 $\Rightarrow$ item 7, and item 8 $\Rightarrow$ item 7. Item 6 follows from
item 7 in the same way as item 2 followed from item 3 in Theorem 2.

Finally, the equivalence between item 5 and item 1 of Theorem 2 is also easy,
since given a fan $T$, every $u \in T$ of length $n$ is equal to $\overline{\alpha}(n)$, for some $\alpha \in T$.

Theorem 4 (one more alternative to fan theorem). For all monotone
bars $\mathcal{U}$ and all infinite sequences $I_0, I_1, \ldots$ of finite subsets of $\mathcal{N}$,

\[ \exists n\ [I_0 \times \ldots \times I_{n-1} \subseteq \mathcal{U}], \]

where $I_0 \times \ldots \times I_{n-1} = \{ \langle a_0, \ldots, a_{n-1} \rangle \mid \forall i < n\ [a_i \in I_i] \}$.

Theorem 4 is equivalent to the fan theorem.

Let $T$ be the set $\bigcup \{ I_0 \times \ldots \times I_{i-1} \mid i \in \mathcal{N} \}$. Clearly, $T$ is a finitely branching
tree. By item 6 of Theorem 3, there is a natural number $n$ such that all the
sequences in $T$ of length $n$ belong to $\mathcal{U}$. Those sequences are exactly the elements
in the set $I_0 \times \ldots \times I_{n-1}$.

Conversely, to prove that item 8 of Theorem 3 follows from Theorem 4, let $T$
be an arbitrary independent-choice finitely branching tree. Let $I_i$ be the set
$\{ k \in \mathcal{N} \mid \exists u\ [\mathrm{length}(u) = i \wedge u \bullet k \in T] \}$. Given $u \in T$ of length $n$, we have
$u \in I_0 \times \ldots \times I_{n-1} \subseteq \mathcal{U}$.
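When the bar is decidable, the number $n$ promised by Theorem 4 can be found by a plain search. The sketch below rests on assumptions that are not part of the theorem: the bar is given by a decidable Python predicate on tuples, and the sequence of finite sets by a function of the index. The search terminates only because the predicate really is a bar. The names `bound`, `in_bar`, `zero_or_long` and `binary_sets` are hypothetical.

```python
from itertools import count, product


def bound(in_bar, choice_set):
    """Least n such that every element of I_0 x ... x I_{n-1} satisfies in_bar.

    in_bar     : decidable monotone bar, as a predicate on tuples of naturals
    choice_set : function i -> finite set I_i
    """
    for n in count():
        sets = [choice_set(i) for i in range(n)]
        if all(in_bar(u) for u in product(*sets)):
            return n


def zero_or_long(u):
    """A monotone bar: sequences containing a 0, or of length at least 3."""
    return 0 in u or len(u) >= 3


def binary_sets(i):
    """The constant sequence of finite sets I_i = {0, 1}."""
    return {0, 1}


n_binary = bound(zero_or_long, binary_sets)
n_zero = bound(zero_or_long, lambda i: {0})
```

With the sets $\{0, 1\}$ the sequence of ones escapes the bar until length 3, whereas with the sets $\{0\}$ the bar is already met at length 1.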

The advantage of the formulation of the fan theorem as in Theorem 4 is that 
it avoids the notions of fan and finitely branching tree. Also, if we extend the 
definition of bar to sets of finite sequences of finite subsets of natural numbers, 
rather than only sets of finite sequences of natural numbers, then we may write 
the fan theorem in the following way. 

Let $I$ range over finite sequences of finite subsets of $\mathcal{N}$, and let $\prod$ denote the
operation that takes the Cartesian product of such a finite sequence, that is,
$\prod \langle I_0, \ldots, I_{n-1} \rangle = I_0 \times \ldots \times I_{n-1}$.

Theorem 5 (final reformulation of fan theorem). Given a monotone set $\mathcal{U}$
of finite sequences of natural numbers, if $\mathcal{U}$ is a bar, then so is $\{ I \mid \prod I \subseteq \mathcal{U} \}$.

This is the formulation which is represented in type theory by Theorem 6, in
Section 4.

3 Inductive Bars

Following the Curry-Howard isomorphism ([CF58] and [How80]) every proposi- 
tion is formally represented in type theory by the set of its proofs. Predicates, 
subsets and families of sets are identified with each other, in the sense that every 
predicate over the elements of a set A, every subset of A, and every family of sets 
indexed by the elements of A, is represented by a function which when applied 
to an element of A returns a set. 

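Given a predicate $\mathcal{U}$ over a set $A$ and a list $u$ in $A^*$, we let

\[ \bigwedge_u \mathcal{U} \]

mean that all the elements in the list $u$ satisfy $\mathcal{U}$. In type theory, it can be defined
inductively with the following introduction rules.

\[ \frac{}{\bigwedge_{\langle\rangle} \mathcal{U}} \qquad \frac{\bigwedge_u \mathcal{U} \qquad \mathcal{U}(a)}{\bigwedge_{u \bullet a} \mathcal{U}} \]

Notice that $\bigwedge_{u * v} \mathcal{U}$ is equivalent to $\bigwedge_u \mathcal{U} \wedge \bigwedge_v \mathcal{U}$. Associated to the definition of
$\bigwedge_u \mathcal{U}$ we have the following principle of induction, for every predicate $X$ over $A^*$.

\[ \frac{\bigwedge_u \mathcal{U} \qquad X(\langle\rangle) \qquad [X(v) \wedge \mathcal{U}(a) \Rightarrow X(v \bullet a)]}{X(u)} \]

When using this principle we refer to it as induction on “the” proof that $\bigwedge_u \mathcal{U}$,
where “the” proof is the proof of $\bigwedge_u \mathcal{U}$ available at that moment.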

In type theory, we formulate the definition of bar for predicates over lists of 
elements of an arbitrary set, rather than only for predicates over lists of natural 
numbers. The following definition is a variation of an idea taken from [Mar68]. 

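Definition 6 (inductive bars). Given a set $A$ and a predicate $\mathcal{U}$ over $A^*$, $\mathcal{U}$
is an inductive bar if $\mathcal{U} \mid \langle\rangle$ (to be read $\mathcal{U}$ bars the empty sequence), where this
is inductively defined with the following introduction rules.

$$\frac{\mathcal{U}(u)}{\mathcal{U} \mid u} \qquad\qquad \frac{\mathcal{U} \mid u}{\mathcal{U} \mid u \bullet a} \qquad\qquad \frac{\forall a \in A\ [\mathcal{U} \mid u \bullet a]}{\mathcal{U} \mid u}$$

Observe that if $\mathcal{U}(u) \Rightarrow \mathcal{V}(u)$ for every $u \in A^*$, then also $\mathcal{U} \mid u \Rightarrow \mathcal{V} \mid u$ for every
$u \in A^*$. Associated to the definition of $\mathcal{U} \mid u$ we have the following principle of
induction, for every predicate $\mathcal{Y}$ over $A^*$.

$$
\frac{
\begin{array}{l}
\mathcal{U} \mid u \\
\forall v \in A^*\ [\mathcal{U}(v) \Rightarrow \mathcal{Y}(v)] \\
\forall v \in A^*\ \forall a \in A\ [\mathcal{Y}(v) \Rightarrow \mathcal{Y}(v \bullet a)] \\
\forall v \in A^*\ \{[\forall a \in A\ \mathcal{Y}(v \bullet a)] \Rightarrow \mathcal{Y}(v)\}
\end{array}
}{\mathcal{Y}(u)}
$$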
When using this principle we refer to it as induction on "the" proof that $\mathcal{U} \mid u$,
where "the" proof is the proof of it available at that moment.

With this principle of induction it is possible to prove in type theory that
the rule BI (with inductive bars instead of bars, and arbitrary sets instead of
natural numbers) is derivable. That is, the rule $\mathrm{BI}_{TT}$ below is derivable.

$$
\frac{
\begin{array}{ll}
\forall u \in A^*\ [\mathcal{X}(u) \Rightarrow \mathcal{Y}(u)] & \mathcal{X} \text{ is included in } \mathcal{Y} \\
\forall u \in A^*\ \forall a \in A\ [\mathcal{X}(u) \Rightarrow \mathcal{X}(u \bullet a)] & \mathcal{X} \text{ is monotone} \\
\forall u \in A^*\ \{[\forall a \in A\ \mathcal{Y}(u \bullet a)] \Rightarrow \mathcal{Y}(u)\} & \mathcal{Y} \text{ is hereditary} \\
\mathcal{X} \mid u & \mathcal{X} \text{ bars } u
\end{array}
}{\mathcal{Y}(u)}
$$

This rule can be derived by showing $\forall v \in A^*\ \mathcal{Y}(u \circledast v)$ by induction on the
proof that $\mathcal{X} \mid u$, where $\circledast$ is a combination of the reverse and append functions,
and is defined as follows.

$$u \circledast \langle\rangle = u \qquad\qquad u \circledast (v \bullet a) = (u \bullet a) \circledast v$$

Observe that whereas BI is an axiom of intuitionistic logic, $\mathrm{BI}_{TT}$ can actually
be proved in type theory. This is because of the definition of inductive bar,
whose equivalence with the standard notion of bar in Definition 4 (in the case
of sequences of natural numbers) is essentially the content of BI itself. More
precisely, for any predicate $\mathcal{U}$ over $\mathbb{N}^*$, $\mathcal{U} \mid \langle\rangle$ implies that $\mathcal{U}$ is a bar, but the
converse is the content of the axiom BI.

In a type-theoretic context, by bar induction we refer to the rule $\mathrm{BI}_{TT}$.
When applying bar induction we will refer by monotonicity condition (of $\mathcal{X}$),
hereditary condition (of $\mathcal{Y}$), and inclusion condition (that is, $\mathcal{X} \subseteq \mathcal{Y}$) to the
instances corresponding to the premises of the rule.

Proposition 2. Given a set $A$, a monotone predicate $\mathcal{U}$ over $A^*$ and a list $u$
of elements of $A$, then

$$\mathcal{U} \mid u \iff \mathcal{V}_u \mid \langle\rangle,$$

where $\mathcal{V}_u = \lambda v\ \mathcal{U}(u * v)$.

The $\Leftarrow$ part is easy, and is left to the reader. Hint: use bar induction
with $\mathcal{X} = \mathcal{V}_u$ and $\mathcal{Y} = \lambda v\ [\mathcal{U} \mid u * v]$; or, for another proof which does not
use monotonicity of $\mathcal{U}$, by induction on the proof that $\mathcal{V}_u \mid \langle\rangle$.

We sketch a proof of the $\Rightarrow$ part, which is by bar induction with $\mathcal{X} = \mathcal{U}$
and $\mathcal{Y} = \lambda u\ [\mathcal{V}_u \mid \langle\rangle]$. The monotonicity condition is a hypothesis of the
proposition and the inclusion condition is trivial. It remains to prove the hereditary
condition. Assume that for all $a \in A$, $\mathcal{V}_{u \bullet a} \mid \langle\rangle$. We have to show $\mathcal{V}_u \mid \langle\rangle$. In
order to do so, we prove that for all $a \in A$, $\mathcal{V}_u \mid \langle a \rangle$. Now, this follows from
$\mathcal{V}_{u \bullet a} \mid \langle\rangle$ by bar induction with $\mathcal{X} = \mathcal{V}_{u \bullet a}$ and $\mathcal{Y} = \lambda v\ [\mathcal{V}_u \mid \langle a \rangle * v]$.
Proposition 3. Given a set $A$, two monotone predicates $\mathcal{U}$, $\mathcal{V}$ over $A^*$ and a
list $u$ of elements of $A$, then

$$\mathcal{U} \mid u \wedge \mathcal{V} \mid u \Rightarrow \mathcal{W} \mid u,$$

where $\mathcal{W} = \lambda u\ [\mathcal{U}(u) \wedge \mathcal{V}(u)]$.

We sketch a proof by bar induction with $\mathcal{X} = \mathcal{U}$ and $\mathcal{Y} = \lambda u\ [\mathcal{V} \mid u \Rightarrow \mathcal{W} \mid u]$.
The monotonicity condition is a hypothesis of the proposition. The hereditary
condition follows from the facts that $\lambda u\ [\mathcal{W} \mid u]$ is hereditary and $\lambda u\ [\mathcal{V} \mid u]$ is
monotone. Finally, the inclusion condition can be proved by bar induction with
$\mathcal{X} = \mathcal{V}$ and $\mathcal{Y} = \lambda u\ [\mathcal{U}(u) \Rightarrow \mathcal{W} \mid u]$, repeating the previous reasoning, except
that the new inclusion condition is trivial.



4 Fan Theorem in Type Theory 

The result we present here is a type-theoretic version of the fan theorem as
formulated in Theorem 5, except that it will be expressed for an arbitrary set $A$
rather than only for natural numbers. Finite subsets $X_i$ of $A$ will be represented
by lists $u_i$ of elements of $A$. Finite sequences $\vec{X}$ of such subsets, by lists $\vec{u}$ of lists.
The function $\bigotimes$ occurring in the statement of Theorem 5 will be represented
by a function $\bigotimes$ which when applied to a list of lists $\langle u_0, \dots, u_{n-1} \rangle$ computes
another list representing the Cartesian product $X_0 \times \dots \times X_{n-1}$.

To define $\bigotimes$ we first define the binary Cartesian product $\times^f$ parametrized
with a function $f$. Then, the finite Cartesian product $\bigotimes^f_b$, also parametrized.
Finally we instantiate it to obtain $\bigotimes$.

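Given a function $f : A \to B$, we denote by $\overline{f} : A^* \to B^*$ the function which
maps $f$ on every element of its argument.

$$\overline{f}(\langle\rangle) = \langle\rangle \qquad\qquad \overline{f}(u \bullet a) = \overline{f}(u) \bullet f(a)$$

Example 1. $\overline{f}(\langle a_0, \dots, a_{n-1} \rangle) = \langle f(a_0), \dots, f(a_{n-1}) \rangle$.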

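Now, the function $\times^f$, which given a function $f : A \to B \to C$, and two lists
$u \in A^*$ and $v \in B^*$, returns a variation of the Cartesian product of $u$ and $v$.
Instead of returning a list in $(A \times B)^*$, it returns a list in $C^*$ by applying the
function $f$ to the components of each possible pair.

$$\langle\rangle \times^f v = \langle\rangle \qquad\qquad (u \bullet a) \times^f v = (u \times^f v) * \overline{f(a)}(v)$$

Example 2. $u \times^f \langle\rangle = \langle\rangle$, for every $u$.

Example 3. $\langle a_0, a_1 \rangle \times^f \langle b_0, b_1 \rangle = \langle f(a_0, b_0), f(a_0, b_1), f(a_1, b_0), f(a_1, b_1) \rangle$.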
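The two defining equations of the parametrized binary product can be transcribed almost literally. The following is a minimal sketch in Python, assuming lists are Python lists, that the function $f$ is passed uncurried as `f(a, b)`, and that the names `map_list`, `prod_f` and `pair` are ours, not the paper's.

```python
def map_list(g, u):
    """Apply g to every element of u (the lifted function of the text)."""
    return [g(x) for x in u]


def prod_f(f, u, v):
    """Binary product parametrized by f.

    Mirrors the two defining equations, consuming u from its last element:
        <> x v      = <>
        (u . a) x v = (u x v) * map(f(a))(v)
    Assumption: f is uncurried, i.e. called as f(a, b).
    """
    if not u:
        return []
    init, a = u[:-1], u[-1]
    return prod_f(f, init, v) + map_list(lambda b: f(a, b), v)


def pair(a, b):
    """Hypothetical choice of f: build the ordinary pair."""
    return (a, b)


# Example 3 of the text, with f instantiated to pairing.
example_3 = prod_f(pair, ["a0", "a1"], ["b0", "b1"])
# Example 2 of the text: the product with the empty list is empty.
example_2 = prod_f(pair, ["a0", "a1"], [])
```

With `f` instantiated to pairing, `example_3` lists the four pairs in the order given by Example 3, and `example_2` is the empty list.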
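The function $\bigotimes^f_b$, given a function $f : B \to A \to B$, a base value $b \in B$, and
$\vec{u} \in A^{**}$, returns a list in $B^*$, each of whose values is the result of iterating the
function $f$ along one tuple, assigning $b$ to the empty tuple. Each tuple consists
of one element from the first list of $\vec{u}$, one from the second, etc. in the style of
the Cartesian product.

$$\bigotimes\nolimits^f_b(\langle\rangle) = \langle b \rangle \qquad\qquad \bigotimes\nolimits^f_b(\vec{u} \bullet u) = \bigotimes\nolimits^f_b(\vec{u}) \times^f u$$

Example 4.

$$\bigotimes\nolimits^f_b(\langle \langle a_0, a_1 \rangle, \langle b_0, b_1 \rangle \rangle) = \langle f(f(b, a_0), b_0), f(f(b, a_0), b_1), f(f(b, a_1), b_0), f(f(b, a_1), b_1) \rangle.$$

Finally, the Cartesian product $\bigotimes$ is obtained by giving $\bullet$ as the function to
iterate, and $\langle\rangle$ as the base value.

$$\bigotimes(\vec{u}) = \bigotimes\nolimits^{\bullet}_{\langle\rangle}(\vec{u})$$

Example 5. $\bigotimes(\langle\rangle) = \langle \langle\rangle \rangle$.

Example 6. $\bigotimes(\vec{u} \bullet \langle\rangle) = \langle\rangle$, for every $\vec{u}$.

Example 7. $\bigotimes(\langle \langle a_0, a_1 \rangle, \langle b_0, b_1 \rangle \rangle) = \langle \langle a_0, b_0 \rangle, \langle a_0, b_1 \rangle, \langle a_1, b_0 \rangle, \langle a_1, b_1 \rangle \rangle$.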
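The iterated product and its instantiation to the Cartesian product can be sketched as a left-to-right fold. This is an illustrative Python sketch under our own assumptions: tuples are Python tuples, $f$ is uncurried, the binary product is written as a comprehension (which produces the same order as its recursive definition), and the names `iter_prod`, `snoc` and `cartesian` are hypothetical.

```python
def prod_f(f, u, v):
    """Binary product parametrized by f, in the order of the text:
    all results for the first element of u, then for the second, etc."""
    return [f(a, b) for a in u for b in v]


def iter_prod(f, b, lists):
    """Iterated product: <b> for the empty list of lists, and otherwise
    the product of the result for the initial lists with the last list."""
    result = [b]
    for u in lists:
        result = prod_f(f, result, u)
    return result


def snoc(t, a):
    """Append one element at the end of a tuple (the operation 'bullet')."""
    return t + (a,)


def cartesian(lists):
    """Cartesian product: iterate snoc starting from the empty tuple."""
    return iter_prod(snoc, (), lists)


# Example 4, with f rendered symbolically as a string builder.
example_4 = iter_prod(lambda x, y: f"f({x},{y})", "b",
                      [["a0", "a1"], ["b0", "b1"]])
example_5 = cartesian([])                      # a single empty tuple
example_6 = cartesian([[1, 2], []])            # no tuples at all
example_7 = cartesian([["a0", "a1"], ["b0", "b1"]])
```

The fold processes the lists from first to last, which matches the recursive equation because that equation peels off the last list.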

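The set $\{\vec{X} \mid \bigotimes \vec{X} \subseteq \mathcal{U}\}$ in Theorem 5 can be interpreted as a predicate $\mathcal{V}$ on
lists $\vec{u}$ of lists which is true when $\bigotimes(\vec{u})$ is "included" in $\mathcal{U}$. As $\bigotimes(\vec{u})$ is actually
not a set but a list, by it being "included" in $\mathcal{U}$ we mean that every element
in the list $\bigotimes(\vec{u})$ satisfies $\mathcal{U}$, that is, $\bigwedge_{\bigotimes(\vec{u})} \mathcal{U}$. Thus, the predicate $\mathcal{V}$ is in fact
interpreted by the function $\lambda \vec{u}\ \bigwedge_{\bigotimes(\vec{u})} \mathcal{U}$. Hence, in type theory Theorem 5
becomes:

Theorem 6 (fan theorem in type theory). Given a set $A$ and a monotone
predicate $\mathcal{U}$ over $A^*$, then if $\mathcal{U}$ is an inductive bar, so is the predicate

$$\lambda \vec{u}\ \bigwedge\nolimits_{\bigotimes(\vec{u})} \mathcal{U}.$$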



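Lemma 1. The following properties hold for every $u$, $v$, $w$, and $\vec{u}$.

1. $(u * v) \times^f w = (u \times^f w) * (v \times^f w)$

2. $\langle a \rangle \times^f (u * v) = (\langle a \rangle \times^f u) * (\langle a \rangle \times^f v)$

3. $\bigotimes(\langle w \bullet a \rangle * \vec{u}) = \bigotimes(\langle w \rangle * \vec{u}) * \bigotimes(\langle \langle a \rangle \rangle * \vec{u})$

4. $\bigwedge_{\bigotimes(\vec{u})} [\lambda u\ \mathcal{U}(\langle a \rangle * u)] \Rightarrow \bigwedge_{\bigotimes(\langle \langle a \rangle \rangle * \vec{u})} \mathcal{U}$

Item 1 can be proved by induction on $v$. Item 2 follows from the fact that
$\overline{g}(u * v) = \overline{g}(u) * \overline{g}(v)$ (letting $g$ be $f(a)$), which can also be proved by induction
on $v$. Item 3 can be proved by induction on $\vec{u}$, using Example 6 in the base case
and item 1 in the inductive case.

Though technically laborious, item 4 is intuitively clear since all the tuples
in $\bigotimes(\langle \langle a \rangle \rangle * \vec{u})$ are of the form $\langle a \rangle * u$ with $u$ a tuple in $\bigotimes(\vec{u})$. We omit that
proof here.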
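The four items of Lemma 1 can be spot-checked on concrete data. The sketch below is only a sanity check on one instance, not a proof; it assumes the list-based reading of the products used above, and the helper names (`prod_f`, `cartesian`, `check_lemma1`) are ours. For item 4 it checks the underlying fact about tuples stated in the text, rather than the implication between predicates.

```python
def prod_f(f, u, v):
    """Binary product parametrized by an uncurried f."""
    return [f(a, b) for a in u for b in v]


def cartesian(lists):
    """Cartesian product of a list of lists, as a list of tuples."""
    result = [()]
    for u in lists:
        result = prod_f(lambda t, a: t + (a,), result, u)
    return result


def check_lemma1(f, a, u, v, w, us):
    """Return one boolean per item of Lemma 1 for the given instance."""
    item1 = prod_f(f, u + v, w) == prod_f(f, u, w) + prod_f(f, v, w)
    item2 = prod_f(f, [a], u + v) == prod_f(f, [a], u) + prod_f(f, [a], v)
    item3 = (cartesian([w + [a]] + us)
             == cartesian([w] + us) + cartesian([[a]] + us))
    # Fact behind item 4: tuples of the product with <<a>> in front are
    # exactly the tuples of the product of us, each prefixed by a.
    item4 = cartesian([[a]] + us) == [(a,) + t for t in cartesian(us)]
    return item1, item2, item3, item4


results = check_lemma1(lambda x, y: (x, y), 9,
                       [1, 2], [3], [4, 5], [[6, 7], [8]])
```

On this instance all four checks succeed, so `results` is a tuple of four `True` values.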

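For the proof of Theorem 6, we define, for $u \in A^*$,

$$\mathcal{V}_u = \lambda \vec{u}\ \bigwedge\nolimits_{\bigotimes(\vec{u})} (\lambda v\ \mathcal{U}(u * v)).$$

We present a proof by bar induction with $\mathcal{X} = \mathcal{U}$ and $\mathcal{Y} = \lambda u\ [\mathcal{V}_u \mid \langle\rangle]$.

The inclusion condition is $\mathcal{U}(u) \Rightarrow \mathcal{V}_u \mid \langle\rangle$, which is easy, since when $\mathcal{U}(u)$
holds, even $\mathcal{V}_u(\langle\rangle)$ holds because $\bigotimes(\langle\rangle) = \langle \langle\rangle \rangle$ by Example 5. The
monotonicity condition is a hypothesis of the theorem. The hereditary condition
is $(\forall a \in A\ [\mathcal{V}_{u \bullet a} \mid \langle\rangle]) \Rightarrow \mathcal{V}_u \mid \langle\rangle$. We assume

$$\forall a \in A\ [\mathcal{V}_{u \bullet a} \mid \langle\rangle] \qquad (1)$$

and given an arbitrary $v$ we prove $\mathcal{V}_u \mid \langle v \rangle$ by induction on $v$.

If $v = \langle\rangle$, then $\mathcal{V}_u \mid \langle \langle\rangle \rangle$ is direct since $\mathcal{V}_u(\langle \langle\rangle \rangle)$ holds because of the
facts that $\bigotimes(\langle \langle\rangle \rangle) = \langle\rangle$ holds by Example 6 and that $\bigwedge_{\langle\rangle}$ is trivially true
regardless of the predicate.

If $v = w \bullet a$ for some $w \in A^*$ (such that $\mathcal{V}_u \mid \langle w \rangle$) and $a \in A$, then we
know by (1) that

$$\mathcal{V}_{u \bullet a} \mid \langle\rangle \quad\text{and}\quad \mathcal{V}_u \mid \langle w \rangle$$

and still have to prove

$$\mathcal{V}_u \mid \langle w \bullet a \rangle.$$

By Proposition 2 it can be written like this: we know

$$\mathcal{V}_{u \bullet a} \mid \langle\rangle \quad\text{and}\quad [\lambda \vec{u}\ \mathcal{V}_u(\langle w \rangle * \vec{u})] \mid \langle\rangle,$$

(hence, by Proposition 3 we know also that

$$[\lambda \vec{u}\ [\mathcal{V}_{u \bullet a}(\vec{u}) \wedge \mathcal{V}_u(\langle w \rangle * \vec{u})]] \mid \langle\rangle \qquad (2)$$

holds) and have to prove

$$[\lambda \vec{u}\ \mathcal{V}_u(\langle w \bullet a \rangle * \vec{u})] \mid \langle\rangle. \qquad (3)$$

To prove that (2) $\Rightarrow$ (3), it is enough to prove that for every $\vec{u} \in A^{**}$,

$$\mathcal{V}_{u \bullet a}(\vec{u}) \wedge \mathcal{V}_u(\langle w \rangle * \vec{u}) \Rightarrow \mathcal{V}_u(\langle w \bullet a \rangle * \vec{u})$$

holds, because of the observation made after Definition 6. By the definition of
$\mathcal{V}_u$ and item 3 of Lemma 1, the right-hand side is equivalent to

$$\mathcal{V}_u(\langle w \rangle * \vec{u}) \wedge \mathcal{V}_u(\langle \langle a \rangle \rangle * \vec{u})$$

which follows from the left-hand side because, by item 4 of Lemma 1, $\mathcal{V}_{u \bullet a}(\vec{u})$
implies $\mathcal{V}_u(\langle \langle a \rangle \rangle * \vec{u})$.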
5 Concluding Remarks

The main contribution of this article is the formalization of a version of the fan 
theorem in type theory. That is, the formulation of Theorem 6 and its proof. 

However, obtaining a version of the fan theorem which admits a direct in- 
terpretation in type theory turned out to enrich the content of the article by 
presenting a variety of equivalent formulations of the fan theorem. 

The key to obtaining such variety was letting $\mathcal{U}$ range over bars on the universal
spread rather than over bars on the fan $T$, in Theorem 2. Indeed, Veldman [Vel98]
showed that analogous variations do not hold intuitionistically if $\mathcal{U}$ is taken to
range over bars on $T$. In that case, for instance, item 2 of Theorem 2 fails to
hold.

Veldman proved this by providing a counterexample which relies on the
Brouwer-Kripke principle (see for instance [Vel81] for the statement of the
principle). The construction of the counterexample is as follows.

Let $\mathcal{P}$ be some unsolved problem which is stable, that is, such that the double
negation of $\mathcal{P}$ implies $\mathcal{P}$. The Brouwer-Kripke principle yields an infinite
sequence of bits $\beta$ such that

$$\mathcal{P} \iff \exists n\ \beta(2n) = 1, \quad\text{and}\quad \neg\mathcal{P} \iff \exists n\ \beta(2n + 1) = 1.$$

Let $0(m)$ and $1(m)$ denote the constant sequences of length $m$:

$$0(m) = \langle \underbrace{0, \dots, 0}_{m} \rangle \qquad\qquad 1(m) = \langle \underbrace{1, \dots, 1}_{m} \rangle.$$

Let $T$ be the tree consisting of the following finite sequences of bits: as long
as $\beta(0) = \beta(1) = \dots = \beta(m - 1) = 0$, the sequences $0(m)$ and $1(m)$ belong
to $T$. As soon as $\beta(n) = 1$ for the first time, then either such an $n$ is odd, in
which case $0(m)$ belongs to $T$ for every $m > n$, or $n$ is even, in which case $1(m)$
belongs to $T$ for every $m > n$. And these are all the sequences belonging to $T$.

Let $\mathcal{U}$ be the set of all finite sequences $u$ such that there exists $n < \mathrm{length}(u)$
such that $\beta(n) = 1$. The set $\mathcal{U}$ is clearly monotone.

Extending the notion of bar in Definition 4 in the natural way to arbitrary
trees, rather than spreads, we prove first that $\mathcal{U}$ is a bar on $T$. This is so since
given any $\alpha \in T$, either $\alpha(0) = 0$ (and $\alpha(n) = 0$ for all $n$) or $\alpha(0) = 1$ (and
$\alpha(n) = 1$ for all $n$). In either case, thanks to the stability of $\mathcal{P}$, $\mathcal{P}$ is solved. Thus,
$\alpha \in T$ implies that $\mathcal{P}$ is solved, in which case $\mathcal{U}$ is a bar, hence $\exists n\ \overline{\alpha}(n) \in \mathcal{U}$.

On the other hand, determining $n$ such that $\forall \alpha \in T\ [\overline{\alpha}(n) \in \mathcal{U}]$ amounts to
solving $\mathcal{P}$, which by assumption is unsolved.

Therefore, this gives a counterexample to the statement

$$\mathcal{U} \text{ monotone bar on } T \ \Rightarrow\ \exists n\ \forall \alpha \in T\ [\overline{\alpha}(n) \in \mathcal{U}].$$
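To make the shapes of the tree and of the bar concrete, the following Python sketch decides membership for a bit sequence that is simply given as a function. This is an assumption made for illustration only: in the counterexample the whole point is that nobody can currently compute the bit sequence, so the code shows the definitions, not the intuitionistic argument. The names `first_one`, `in_T` and `in_U` are ours.

```python
def first_one(beta, m):
    """Least n < m with beta(n) == 1, or None if there is none."""
    for n in range(m):
        if beta(n) == 1:
            return n
    return None


def in_T(beta, s):
    """Membership of the finite bit sequence s in the tree.

    Only constant sequences can belong to the tree. While no 1 has shown
    up among beta(0..len(s)-1), both constant sequences belong; once the
    first 1 appears at n < len(s), only the all-zero sequence survives
    when n is odd, and only the all-one sequence when n is even.
    """
    m = len(s)
    if any(bit not in (0, 1) or bit != s[0] for bit in s):
        return False
    n = first_one(beta, m)
    if n is None:
        return True
    return s[0] == (0 if n % 2 == 1 else 1)


def in_U(beta, u):
    """Membership in the bar: some n < length(u) has beta(n) == 1."""
    return first_one(beta, len(u)) is not None


# Hypothetical bit sequence whose first (and only) 1 is at the odd index 3.
def beta(n):
    return 1 if n == 3 else 0
```

For this particular `beta`, sequences of length at most 3 are unconstrained apart from being constant, longer ones must be all zeros, and a sequence belongs to the bar exactly when its length exceeds 3.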
1 Introduction

The λ-calculus and its typed versions are important tools for studying most
fundamental computation and deduction paradigms. However, the non-trivial
nature of substitution, as used in the definition of λ-reduction notably, has spurred
the design of various first-order languages representing λ-terms, λ-reduction and
λ-conversion, where computation is simple, first-order rewriting, and
substitution becomes an easy notion again. Let us cite Curry's combinators [7], Curien's
categorical combinators [5], the myriad of so-called λ-calculi with explicit
substitutions, among which $\lambda\sigma$ [1], $\lambda\sigma_\Uparrow$ [12], $\lambda\upsilon$ [17], $\lambda\zeta$ [19], etc. Unfortunately, each
one of these calculi has defects: Curry's combinators do not model λ-conversion
fully, categorical combinators, $\lambda\sigma$ and $\lambda\sigma_\Uparrow$ do not normalize strongly in the
typed case [18], $\lambda\upsilon$ is not confluent in the presence of free variables (a.k.a.
meta-variables), $\lambda\zeta$ models λ-conversion but not λ-reduction, etc.

SKInT [9] is a first-order language and rewrite system that does not have
these defects. In particular, SKInT models reduction in the λ-calculus, in the
sense that there is a mapping $L^*$ from λ-terms to SKInT such that whenever $u$
rewrites to $v$ in the λ-calculus, then $L^*(u)$ rewrites to $L^*(v)$ in SKInT. SKInT is
confluent even on open terms, i.e. terms with meta-variables; $L^*$ defines a
conservative embedding of the λ-calculus inside SKInT, in that $u$ and $v$ are
λ-convertible if and only if $L^*(u)$ and $L^*(v)$ are convertible in SKInT; reduction in
SKInT standardizes; $L^*$ preserves weak normalization, i.e., if $u$ is a weakly
normalizing λ-term, then $L^*(u)$ is a weakly normalizing SKInT-term. SKInT also
enjoys a simple type discipline corresponding to that of the λ-calculus, that is,
if $u$ is a λ-term of type $\tau$, then $L^*(u)$ is a SKInT-term of some type $L^*(\tau)$
easily computed from $\tau$; reduction in SKInT obeys subject reduction, and every
simply-typed SKInT-term normalizes strongly [9].

The aim of this paper is to extend our result on preservation of weak
normalization to show that SKInT, like $\lambda\upsilon$ and $\lambda\zeta$, also preserves strong normalization,
and also solvability; this is done by showing that, just like in the λ-calculus,
strongly normalizing, weakly normalizing and solvable terms are characterized
as terms that are typable in various conjunctive type disciplines [20,4].

The plan of the paper is as follows: we introduce the required notions and 
notations in Section 2, then we attack our goal in Section 3. We show that any 

* This work has been done in the context of Dyade (Bull/Inria R&D joint venture). 
reasonable translation from the λ-calculus to SKInT preserves solvability, weak
normalization and strong normalization (Corollary 1), where "reasonable" means
that it preserves typability in certain systems of conjunctive types; this holds
in particular for the translations of [9]. In turn, this result follows from a result
stating that all terms that are typable in particular systems of conjunctive types
(to be defined in Section 3) have the corresponding normalization properties
(Theorem 1). The converse also holds (Corollary 2), just as in the λ-calculus.

The proofs have been reduced to keep the paper short and reasonably legible. 
Full proofs can be found in [11]. 

2 SKInT and the λ-Calculus

Recall that the syntax of the λ-calculus is [3]:

$$t ::= x \mid tt \mid \lambda x \cdot t$$

where $x$ ranges over an infinite set of so-called variables, and terms $s$ and $t$
that are α-equivalent are considered equal; we denote λ-terms by $s$, $t$, ..., and
variables by $x$, $y$, $z$, etc. α-equivalence is the compatible closure of $\lambda x \cdot (t[x/y]) = \lambda y \cdot t$
and $t[s/x]$ denotes the usual capture-avoiding substitution of $s$ for $x$ in $t$.
We shall write $=$ for α-equivalence; in the first-order calculi to come, $=$ will
denote syntactic equality.

The basic computation rule is β-reduction, the compatible closure of:

$$(\beta)\quad (\lambda x \cdot t)s \to t[s/x]$$

The relation $\to$ is the compatible closure of this relation, $\to^*$ is the
reflexive-transitive closure of the latter, and $\to^+$ is its transitive closure. We shall use
$\to$, $\to^*$, $\to^+$ ambiguously in other calculi as well, taking care to make clear
which is intended.

We shall also add the following η-reduction rule:

$$(\eta)\quad \lambda x \cdot tx \to t \qquad (x \text{ not free in } t)$$

to the λ-calculus, yielding the so-called $\lambda_\eta$-calculus. The corresponding
compatible closure relation will sometimes be noted $\to_\eta$ to distinguish it from $\to$,
and similarly for the calculi to come and their respective η rules.

The terms of SKInT, and of its companion calculus SKIn [9], on the other
hand, are defined by the grammar:

$$u ::= x \mid I_\ell \mid S_\ell(u, u) \mid K_\ell(u)$$

where $\ell$ ranges over $\mathbb{N}$. This is an infinitary first-order language. The reduction
rules of SKInT are shown in Figure 1, thus defining an infinite rewrite system.
The semantical idea behind SKInT, or SKIn, will be made clear by stating an
informal translation from SKInT (or SKIn) to the λ-calculus. Intuitively:

$$
\begin{array}{rcl}
I_\ell & \sim & \lambda x_0 \cdot \ldots \cdot \lambda x_{\ell-1} \cdot \lambda x_\ell \cdot x_\ell \\
S_\ell(u, v) & \sim & \lambda x_0 \cdot \ldots \cdot \lambda x_{\ell-1} \cdot u x_0 \ldots x_{\ell-1} (v x_0 \ldots x_{\ell-1}) \\
K_\ell(u) & \sim & \lambda x_0 \cdot \ldots \cdot \lambda x_{\ell-1} \cdot \lambda x_\ell \cdot u x_0 \ldots x_{\ell-1}
\end{array}
$$
So $I_\ell$, $S_\ell$, $K_\ell$ generalize Curry's combinators $I$, $S$ and $K$ respectively.

SKIn is defined as SKInT, except that rule $(K_\ell S_{L+1})$ is replaced by $(K_\ell S_L)$:
$K_\ell(S_{L-1}(u, v)) \to S_L(K_\ell(u), K_\ell(v))$; conversely, SKInT is as SKIn, except that
rule $(K_\ell S_L)$ is restricted to the case $\ell < L - 1$.

$$
\begin{array}{llll}
(SI_\ell) & S_\ell(I_\ell, w) \to w & (SK_\ell) & S_\ell(K_\ell(u), w) \to u \\
(S_\ell I_L) & S_\ell(I_L, w) \to I_{L-1} & (K_\ell I_L) & K_\ell(I_{L-1}) \to I_L \\
(S_\ell K_L) & S_\ell(K_L(u), w) \to K_{L-1}(S_\ell(u, w)) & (K_\ell K_L) & K_\ell(K_{L-1}(u)) \to K_L(K_\ell(u)) \\
(S_\ell S_L) & S_\ell(S_L(u, v), w) \to S_{L-1}(S_\ell(u, w), S_\ell(v, w)) & (K_\ell S_{L+1}) & K_\ell(S_L(u, v)) \to S_{L+1}(K_\ell(u), K_\ell(v))
\end{array}
$$

Fig. 1. SKInT reduction rules (for every $0 \le \ell < L$)



Both SKIn and SKInT are confluent and standardize. (See Section 3 for the 
definition of standard reductions.) 
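As an illustration of how the rule schemes of Figure 1 act on terms, here is a small rewriting sketch in Python. It encodes the eight schemes as we read them from the figure; the tuple encoding of terms, the leftmost-outermost strategy, the fuel bound and all function names are our own assumptions, not part of SKInT's definition.

```python
# Terms: ("var", name), ("I", l), ("S", l, u, v), ("K", l, u).

def step_root(t):
    """Contract t if t itself is a redex of Figure 1, else return None."""
    if t[0] == "S":
        _, l, head, w = t
        if head[0] == "I":
            L = head[1]
            if L == l:
                return w                                   # (SI_l)
            if l < L:
                return ("I", L - 1)                        # (S_l I_L)
        elif head[0] == "K":
            _, L, u = head
            if L == l:
                return u                                   # (SK_l)
            if l < L:
                return ("K", L - 1, ("S", l, u, w))        # (S_l K_L)
        elif head[0] == "S":
            _, L, u, v = head
            if l < L:                                      # (S_l S_L)
                return ("S", L - 1, ("S", l, u, w), ("S", l, v, w))
    elif t[0] == "K":
        _, l, arg = t
        if arg[0] == "I" and l <= arg[1]:
            return ("I", arg[1] + 1)                       # (K_l I_L)
        if arg[0] == "K" and l <= arg[1]:
            return ("K", arg[1] + 1, ("K", l, arg[2]))     # (K_l K_L)
        if arg[0] == "S" and l < arg[1]:                   # (K_l S_{L+1})
            return ("S", arg[1] + 1, ("K", l, arg[2]), ("K", l, arg[3]))
    return None


def step(t):
    """One leftmost-outermost step, or None if t is normal."""
    r = step_root(t)
    if r is not None:
        return r
    if t[0] in ("S", "K"):
        for i in range(2, len(t)):
            s = step(t[i])
            if s is not None:
                return t[:i] + (s,) + t[i + 1:]
    return None


def normalize(t, fuel=1000):
    """Rewrite until normal; give up after `fuel` steps (untyped terms
    need not terminate)."""
    while fuel > 0:
        n = step(t)
        if n is None:
            return t
        t, fuel = n, fuel - 1
    raise RuntimeError("fuel exhausted")


a, b = ("var", "a"), ("var", "b")
# S_0(S_0(K_1(I_0), a), b): informally, (\x.\y.x) applied to a and b.
demo = normalize(("S", 0, ("S", 0, ("K", 1, ("I", 0)), a), b))
```

On the demo term, the sketch first uses $(S_\ell K_L)$, then $(SK_\ell)$ and $(SI_\ell)$, and ends on the variable `a`.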

We can split SKInT in two: the set of all rules $(SI_\ell)$, $\ell \ge 0$, corresponds
somehow to the actual β-reduction rule of the λ-calculus, or more precisely to
βI-reduction (the notion of reduction of λI), and we shall call this group of rules βI.
All other rules essentially correspond to the propagation of substitutions in the
λ-calculus, and we call the set of these rules ΣT. Similarly, Σ is SKIn minus βI.
It turns out that both Σ and ΣT are confluent, but ΣT terminates while Σ only
normalizes weakly (even in a typed setting, see [9]).

We shall also consider SKInT$_\eta$, which is SKInT plus the following group η:

$$(\eta S_\ell)\quad S_{\ell+1}(K_\ell(u), I_\ell) \to u \qquad (\ell \ge 0)$$

SKInT$_\eta$ is also confluent, and η-reductions can be postponed after all other
reductions, just like in the λ-calculus.

The natural translation from the λ-calculus to SKInT, resp. SKIn, is $t \mapsto t^*$,
defined in Figure 2. Whenever $u \to v$ in the λ-calculus, $u^* \to^+ v^*$ in SKIn,
but not in SKInT. In the case of SKInT, we have to use a more complicated
translation, like $L^*$: then $u \to v$ implies $L^*(u) \to^+ L^*(v)$. $L^*(s)$ is defined
as $(L(s))^*$, where $L(s)$ is $\lambda z \cdot L_z(s)$ with $z$ a fresh variable, and $L_z(x) =_{df} xz$,
$L_z(\lambda x \cdot s) =_{df} \lambda x \cdot L_z(s)$, $L_z(st) =_{df} L_z(s)(L(t))$ (see [9] for an explanation).

The simple type discipline for the λ-calculus is defined by judgments $\Gamma \vdash t : \tau$,
where $t$ is a λ-term, $\tau$ is a simple type, i.e. a term in the following language:

$$\tau ::= B \mid \tau \to \tau$$

where $B$ ranges over a given non-empty set of so-called base types. Finally, $\Gamma$ is a context,
namely a finite map from variables to types; $\Gamma, x : \tau$ denotes $\Gamma$ enriched by
mapping $x$ to $\tau$, where $x$ is outside the domain of $\Gamma$.
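$$
\begin{array}{rcl@{\qquad}rcl}
x^* & = & x & [x]x & = & I_0 \\
(st)^* & = & S_0(s^*, t^*) & [x]y & = & K_0(y) \quad (y \ne x) \\
(\lambda x \cdot t)^* & = & [x](t^*) & [x](I_\ell) & = & I_{\ell+1} \\
 & & & [x](S_\ell(u, v)) & = & S_{\ell+1}([x]u, [x]v) \\
 & & & [x](K_\ell(u)) & = & K_{\ell+1}([x]u)
\end{array}
$$

Fig. 2. Translation from the λ-calculus to SKIn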
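The translation of Figure 2 is a form of bracket abstraction and can be sketched directly. The Python code below follows the equations as we read them from the figure; the tuple encodings of λ-terms and of SKIn terms, and the names `abstract` and `star`, are illustrative assumptions.

```python
# Lambda-terms: ("var", x), ("app", s, t), ("lam", x, body).
# SKIn terms:   ("var", x), ("I", l), ("S", l, u, v), ("K", l, u).

def abstract(x, u):
    """Bracket abstraction [x]u on SKIn terms."""
    tag = u[0]
    if tag == "var":
        # [x]x = I_0 and [x]y = K_0(y) for y different from x
        return ("I", 0) if u[1] == x else ("K", 0, u)
    if tag == "I":
        return ("I", u[1] + 1)
    if tag == "S":
        return ("S", u[1] + 1, abstract(x, u[2]), abstract(x, u[3]))
    if tag == "K":
        return ("K", u[1] + 1, abstract(x, u[2]))
    raise ValueError(f"not a SKIn term: {u!r}")


def star(t):
    """The translation t |-> t* from lambda-terms to SKIn terms."""
    tag = t[0]
    if tag == "var":
        return t
    if tag == "app":
        return ("S", 0, star(t[1]), star(t[2]))
    if tag == "lam":
        return abstract(t[1], star(t[2]))
    raise ValueError(f"not a lambda-term: {t!r}")


x, y = ("var", "x"), ("var", "y")
identity = star(("lam", "x", x))                 # \x.x
konst = star(("lam", "x", ("lam", "y", x)))      # \x.\y.x
self_app = star(("lam", "x", ("app", x, x)))     # \x.xx
```

Under these equations the identity becomes $I_0$, the constant combinator becomes $K_1(I_0)$, and self-application becomes $S_1(I_0, I_0)$.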



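The typing rules for the λ-calculus are:

$$\frac{}{\Gamma, x : \tau \vdash x : \tau} \qquad \frac{\Gamma \vdash s : \tau_1 \to \tau_2 \quad \Gamma \vdash t : \tau_1}{\Gamma \vdash st : \tau_2} \qquad \frac{\Gamma, x : \tau_1 \vdash s : \tau_2}{\Gamma \vdash \lambda x \cdot s : \tau_1 \to \tau_2}$$

Those for SKInT are shown in Figure 3. Notice that in typing $K_\ell(u)$, the
new type $\tau'$ is always inserted before an arrow type: this is intentional (see [9]),
and is related to insights in proof systems for the modal logic S4.

$$\frac{}{\Gamma, x : \tau \vdash x : \tau} \qquad \frac{}{\Gamma \vdash I_\ell : \tau_0 \to \ldots \to \tau_{\ell-1} \to \tau_\ell \to \tau_\ell}$$

$$\frac{\Gamma \vdash u : \tau_0 \to \ldots \to \tau_{\ell-1} \to \tau_\ell \to \tau \qquad \Gamma \vdash v : \tau_0 \to \ldots \to \tau_{\ell-1} \to \tau_\ell}{\Gamma \vdash S_\ell(u, v) : \tau_0 \to \ldots \to \tau_{\ell-1} \to \tau}$$

$$\frac{\Gamma \vdash u : \tau_0 \to \ldots \to \tau_{\ell-1} \to \tau_\ell \to \tau}{\Gamma \vdash K_\ell(u) : \tau_0 \to \ldots \to \tau_{\ell-1} \to \tau' \to \tau_\ell \to \tau}$$

Fig. 3. Simple types for SKInT

Both the λ-calculus and SKInT enjoy subject reduction and normalize
strongly on simply-typed terms. Moreover, we have the following meta-theorem
on SKInT. Call a context $\Gamma$ arrowed if and only if $\Gamma$ maps every variable in its
domain to an arrow type (of the form $\tau_1 \to \tau_2$); then for every arrowed
context $\Gamma$, if $\Gamma, x : \tau_1 \vdash u : \tau_2$, then $\Gamma \vdash [x]u : \tau_1 \to \tau_2$. The restriction to arrowed
contexts is necessary: consider the case where $u$ is a variable other than $x$.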



3 Conjunctive Types 

We now turn to the relationship between conjunctive type systems for SKInT and
termination properties, à la Sallé-Coppo [20,4]. We first recall a few notions
from [9]. Define the spines $S$ by the grammar:

$$S ::= x \mid I_\ell \mid S_\ell S \mid K_\ell S \qquad (\ell \ge 0)$$
The arity of a spine is the number of operators of the form $S_\ell$, $\ell \ge 0$, in it. If $n$
is the arity of $S$, and $v_1, \ldots, v_n$ are $n$ SKInT-terms, then the term $S[v_1, \ldots, v_n]$
is defined by:

$$
\begin{array}{rcl@{\qquad}rcl}
x[] & =_{df} & x & (S_\ell S)[v_1, \ldots, v_n] & =_{df} & S_\ell(S[v_1, \ldots, v_{n-1}], v_n) \\
I_\ell[] & =_{df} & I_\ell & (K_\ell S)[v_1, \ldots, v_n] & =_{df} & K_\ell(S[v_1, \ldots, v_n])
\end{array}
$$

Every term $u$ can be written in a unique way as $S[v_1, \ldots, v_n]$: the spine $S$ is the
sequence of operators occurring along the leftmost branch of $u$, read top-down.
Then the terms $v_1, \ldots, v_n$ are the second arguments of operators of the form
$S_\ell$, $\ell \ge 0$, on the spine, read bottom-up. The terms $v_1, \ldots, v_n$ are called the
arguments of $u$. Iterating this decomposition of terms in spine and arguments
allows us to see terms as trees of spines.
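The decomposition of a term into its spine and arguments can be made concrete as follows. This is a sketch under our own encoding assumptions (terms as tuples, spine operators as `("S", l)` and `("K", l)` pairs, the spine as a Python list ending in a variable or an $I_\ell$); `decompose` and `rebuild` are hypothetical names.

```python
# Terms: ("var", name), ("I", l), ("S", l, u, v), ("K", l, u).

def decompose(u):
    """Split u into (spine, args).

    The spine lists the operators met along the leftmost branch, read
    top-down, and ends with the variable or I_l at the bottom. The
    arguments are the second arguments of the S operators on the spine,
    returned bottom-up, as in the text.
    """
    spine, args = [], []
    while True:
        if u[0] == "S":
            spine.append(("S", u[1]))
            args.append(u[3])
            u = u[2]
        elif u[0] == "K":
            spine.append(("K", u[1]))
            u = u[2]
        else:
            spine.append(u)
            break
    args.reverse()          # collected top-down, reported bottom-up
    return spine, args


def rebuild(spine, args):
    """Inverse of decompose: compute S[v_1, ..., v_n]."""
    remaining = list(args)

    def go(i):
        op = spine[i]
        if i == len(spine) - 1:
            return op                       # variable or I_l
        if op[0] == "S":
            v = remaining.pop()             # topmost S takes the last argument
            return ("S", op[1], go(i + 1), v)
        return ("K", op[1], go(i + 1))

    return go(0)


x, a, b = ("var", "x"), ("var", "a"), ("var", "b")
term = ("S", 0, ("K", 1, ("S", 2, x, a)), b)
spine, args = decompose(term)
```

For the sample term the spine is $S_0\,K_1\,S_2\,x$, of arity 2, and the arguments are `a` then `b`; rebuilding from the two parts gives back the original term.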

Call a one-step spine reduction $u \to_s v$ any one-step reduction of a redex
occurring on the spine of $u$. A spine reduction $u \to_s^* v$ is a sequence of
one-step spine reductions. A term that has no one-step spine contractum is called
spine-normal. Spine reductions play the role of head reductions in the λ-calculus.
(However, spine reductions are not unique.)

Define standard reductions $u \to_{std}^* v$ by induction on $v$ viewed as a tree
of spines, if and only if $u \to_s^* S[u_1, \ldots, u_n]$, and $v = S[v_1, \ldots, v_n]$, where
$u_i \to_{std}^* v_i$ for each $i$, $1 \le i \le n$. SKInT standardizes [9], in that $u \to^* v$ (in
SKInT) implies $u \to_{std}^* v$. In particular, a SKInT-term $u$ is weakly
normalizable if and only if it has a normalizing standard reduction.

Definition 1 $u$ is solvable if any of the following equivalent conditions hold:

(i) All SKInT-spine reductions starting from $u$ terminate;

(ii) Some SKInT-spine reduction starting from $u$ terminates.

Proof. That (i) implies (ii) is clear. Conversely, write $u \Rightarrow v$ when $u$ is
ΣT-normal, has a spine βI-redex $S_\ell(I_\ell, w)$, and $v$ is the unique ΣT-normal form of
the spine βI-contraction of $u$. (Notice that any term has at most one spine
βI-redex, hence $\Rightarrow$-reductions are unique.) By examining how rules commute, we
can show that if $u \to_s^* v$ by using βI $n$ times, then $\Sigma T(u) \Rightarrow^* \Sigma T(v)$ in
exactly $n$ steps (see the appendices to [11]). It follows that, if (ii) holds (with
termination in $n$ spine βI-steps), then any $\Rightarrow$-reduction starting from $u$
terminates (in exactly $n$ steps), and therefore all spine reductions do exactly $n$ spine
βI-steps; as ΣT terminates, all these spine reductions must be finite. □

We wish to characterize solvable, weakly normalizing and strongly normalizing
terms as terms that can be typed in some systems of conjunctive types. It will be
profitable to use a simplified format for conjunctive types, due to S. van Bakel [2]
(see also [21]), where the type of any given term is unique (modulo the choice
of types for occurrences of variables) and has a well-defined arity: this is in
contrast with the usual sort of conjunctive types (see e.g. [15,8]). Define the strict
intersection types $\tau$ by:

$$\tau ::= B \mid \mu \to \tau \qquad\qquad \mu ::= [\tau_1, \ldots, \tau_n] \quad (n \ge 0)$$

where $[\tau_1, \ldots, \tau_n]$ is the multiset of types $\tau_1, \ldots, \tau_n$. Intuitively, $[\tau_1, \ldots, \tau_n]$
denotes the intersection of $\tau_1, \ldots, \tau_n$. If $n = 0$, $[]$ denotes the set of all terms.
We shall also write $\omega$ for $[]$, and $\mu_1 \wedge \mu_2$ for the multiset union of $\mu_1$ and $\mu_2$.
Strongly normalizing λ-terms are those that are typable in system $S$, defined
as follows. Call $S$-types the types generated by the grammar:

$$\tau ::= B \mid \mu \to \tau \qquad\qquad \mu ::= [\tau_1, \ldots, \tau_n] \quad (n \ge 1)$$

That is, multisets of types are now restricted to be non-empty. We define
system $S$ as the system of Figure 4 (again), but where $\tau$ and $\mu$-types are restricted
to be $S$-types.



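$$\frac{}{\Gamma, x : \mu \wedge \tau \vdash x : \tau} \qquad \frac{\Gamma \vdash u : [\tau_1, \ldots, \tau_n] \to \tau \quad \Gamma \vdash v : \tau_1 \ \ldots\ \Gamma \vdash v : \tau_n}{\Gamma \vdash uv : \tau} \qquad \frac{\Gamma, x : \mu \vdash u : \tau}{\Gamma \vdash \lambda x \cdot u : \mu \to \tau}$$

Fig. 4. Conjunctive types for the λ-calculus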

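Correspondingly, we endow SKInT with the typing rules of Figure 5, yielding
a typing system that we call $S\omega$ (when types are $S\omega$-types), or $S$ (when types
are $S$-types).

$$\frac{}{\Gamma, x : \mu \wedge \tau \vdash x : \tau} \qquad \frac{}{\Gamma \vdash I_\ell : \mu_0 \to \ldots \to \mu_{\ell-1} \to \mu_\ell \wedge \tau \to \tau}$$

$$
\frac{
\begin{array}{l}
\Gamma \vdash u : \mu_0 \to \ldots \to \mu_{\ell-1} \to [\tau_1, \ldots, \tau_n] \to \tau \\
\Gamma \vdash v : \mu_0 \to \ldots \to \mu_{\ell-1} \to \tau_1 \\
\qquad \vdots \\
\Gamma \vdash v : \mu_0 \to \ldots \to \mu_{\ell-1} \to \tau_n
\end{array}
}{\Gamma \vdash S_\ell(u, v) : \mu_0 \to \ldots \to \mu_{\ell-1} \to \tau}
$$

$$\frac{\Gamma \vdash u : \mu_0 \to \ldots \to \mu_{\ell-1} \to \mu_\ell \to \tau}{\Gamma \vdash K_\ell(u) : \mu_0 \to \ldots \to \mu_{\ell-1} \to \mu \to \mu_\ell \to \tau}$$

Fig. 5. The system of conjunctive types for SKInT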



Call an $S\omega$-type $\tau$ definite positive if and only if $\omega$ only occurs negatively in $\tau$.
More formally, the definite positive types $\tau^+$ and the definite negative types $\tau^-$
are defined by the grammar:

$$
\begin{array}{ll}
\tau^+ ::= B \mid \mu^- \to \tau^+ & \mu^+ ::= [\tau^+_1, \ldots, \tau^+_n] \quad (n \ge 1) \\
\tau^- ::= B \mid \mu^+ \to \tau^- & \mu^- ::= [\tau^-_1, \ldots, \tau^-_n] \quad (n \ge 0)
\end{array}
$$

A context $\Gamma$ is definite negative if and only if every binding in $\Gamma$ is of the form
$x : \mu^-$ with $\mu^-$ definite negative. We say that the typing judgement $\Gamma \vdash s : \tau$ is
definite positive if and only if $\Gamma$ is definite negative and $\tau$ is definite positive.
We first have the following, which we leave to the reader to check:

Lemma 1 (Subject Reduction). If $\Gamma \vdash u : \tau$ in $S\omega$, resp. $S$, and $u \to^* v$
in SKInT or SKInT$_\eta$, then $\Gamma \vdash v : \tau$ in $S\omega$, resp. $S$.

Theorem 1 (Normalization). The following holds:

(i) If $\Gamma \vdash u : \tau$ is derivable in system $S\omega$, then $u$ is solvable;

(ii) If $\Gamma^- \vdash u : \tau^+$ is definite positive and derivable in system $S\omega$, then $u$ is
weakly normalizing in SKInT and in SKInT$_\eta$;

(iii) If $\Gamma \vdash u : \tau$ is derivable in system $S$, then $u$ is strongly normalizing in
SKInT and even in SKInT$_\eta$.

Proof. The idea is to translate the term $u$ to a λ-term, and to use the
corresponding results for the λ-calculus. We modify the $u \mapsto [\![u]\!](s_0, \ldots, s_{n-1})$ translation
of [9] to map SKInT-terms to $\lambda_{\oplus\epsilon}$-terms: the idea is that instead of dropping
some arguments (like $s_\ell$ in the definition of $[\![K_\ell(u)]\!](s_0, \ldots, s_{n-1})$ for $n > \ell$),
we shall keep them on the left of some binary operator $\oplus$ such that $s \oplus t$ is
semantically equivalent to $t$ alone.

The $\lambda_{\oplus\epsilon}$-calculus is defined by the grammar:

$$t ::= x \mid tt \mid \lambda x \cdot t \mid \epsilon t \mid t \oplus t$$

and its reduction rules are βη-reduction plus:

$$(\epsilon)\ \ \epsilon t \to t \qquad (\oplus-)\ \ t_1 \oplus t_2 \to t_2 \qquad (\oplus)\ \ (t_1 \oplus t_2) \oplus t_3 \to t_1 \oplus (t_2 \oplus t_3)$$

We define type systems that we call again $S\omega$, resp. $S$, defined on $S\omega$-types,
resp. $S$-types, and whose typing rules are those of Figure 4, plus:

$$\frac{\Gamma \vdash t : \tau}{\Gamma \vdash \epsilon t : \tau} \qquad\qquad \frac{[\Gamma \vdash t_1 : \tau_1] \quad \Gamma \vdash t_2 : \tau_2}{\Gamma \vdash t_1 \oplus t_2 : \tau_2}$$

where the bracketed premise $\Gamma \vdash t_1 : \tau_1$ is included in $S$, but omitted in $S\omega$.

There is an erasing translation $t \mapsto |t|$ from $\lambda_{\oplus\epsilon}$ to the λ-calculus (with
β-reduction): $|\epsilon t| = |t|$, $|t_1 \oplus t_2| = |t_2|$, $|x| = x$, $|t_1 t_2| = |t_1|\,|t_2|$, $|\lambda x \cdot tx| = |t|$ if $x$
is not free in $t$, and $|\lambda x \cdot t| = \lambda x \cdot |t|$ if $t$ is not of the form $t'x$ with $x$ not free
in $t'$. $\lambda_{\oplus\epsilon}$ has the subject reduction property, and for every $\lambda_{\oplus\epsilon}$-term $t$:

(a) If $\Gamma \vdash t : \tau$ in $S\omega$, then $t$ is solvable, i.e., all head-reductions starting from $t$
terminate. We call head-reduction in $\lambda_{\oplus\epsilon}$ any $(\epsilon)$, $(\oplus-)$, $(\oplus)$ or $(\eta)$-step, or
any $(\beta)$-reduction step $s \to t$ such that $|s| \to |t|$ by a head $(\beta)$-reduction
step in the λ-calculus (i.e., the $\lambda_{\oplus\epsilon}$-redex does not get erased).

(b) If $\Gamma \vdash t : \tau$ in $S$, then $t$ is strongly normalizing.

The proofs are by appealing to the same properties in the λ-calculus, using the
erasing translation above (see [11]), or by reducibility methods (for (b)): see [10].
Define the dimension $\dim u$ of a SKInT-term $u$ by:

$$
\begin{array}{ll}
\dim x =_{df} -1 & \dim I_\ell =_{df} \ell \\
\dim K_\ell(u) =_{df} \max(\ell, \dim u) + 1 & \dim S_\ell(u, v) =_{df} \max(\ell, \dim u) - 1
\end{array}
$$
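$$
[\![u]\!]_\oplus(s_0, \ldots, s_{n-1}) =_{df}
\left\{
\begin{array}{ll}
x s_0 \ldots s_{n-1} & \text{if } u = x \\
\epsilon(s_0 \oplus \ldots \oplus s_{\ell-1} \oplus s_\ell) s_{\ell+1} \ldots s_{n-1} & \text{if } u = I_\ell \\
[\![v]\!]_\oplus(s_0, \ldots, s_{\ell-1}, s_\ell \oplus s_{\ell+1}, s_{\ell+2}, \ldots, s_{n-1}) & \text{if } u = K_\ell(v) \\
[\![v]\!]_\oplus(s_0, \ldots, s_{\ell-1}, [\![w]\!]_\oplus(s_0, \ldots, s_{\ell-1}), s_\ell, \ldots, s_{n-1}) & \text{if } u = S_\ell(v, w)
\end{array}
\right.
\qquad \text{when } n > \dim u
$$

$$
[\![u]\!]_\oplus(s_0, \ldots, s_{n-1}) =_{df} \lambda x_n \cdot \ldots \cdot \lambda x_{m-1} \cdot [\![u]\!]_\oplus(s_0, \ldots, s_{n-1}, x_n, \ldots, x_{m-1})
\qquad \text{when } n < m = \dim u + 1
$$

Fig. 6. Interpretation of SKInT-terms as $\lambda_{\oplus\epsilon}$-terms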



The new translation $u \mapsto [\![u]\!]_\oplus(s_0, \ldots, s_{n-1})$, parameterized by a list $s_0, \ldots, s_{n-1}$
of $\lambda_{\oplus\epsilon}$-terms, is described in Figure 6.

By abuse of language, say that $\Gamma \vdash s : \mu$ is derivable in $S\omega$, resp. $S$, where
$\mu = [\tau_1, \ldots, \tau_k]$, if and only if $\Gamma \vdash s : \tau_i$ is derivable in $S\omega$, resp. $S$, for every $i$,
$1 \le i \le k$. An easy structural induction on $u$ (see [11]) shows that, if $\Gamma \vdash u :
\mu_0 \to \ldots \to \mu_{n-1} \to \tau$ in $S\omega$, resp. $S$, and $\Gamma \vdash s_i : \mu_i$ in $S\omega$, resp. $S$, for every $i$,
$0 \le i < n$, then $\Gamma \vdash [\![u]\!]_\oplus(s_0, \ldots, s_{n-1}) : \tau$ in $S\omega$, resp. $S$.

A tedious check now shows that whenever $u \to v$ in SKInT$_\eta$, then for every
sequence $s_0, \ldots, s_{n-1}$ of $\lambda_{\oplus\epsilon}$-terms, $[\![u]\!]_\oplus(s_0, \ldots, s_{n-1}) \to^* [\![v]\!]_\oplus(s_0, \ldots, s_{n-1})$
in $\lambda_{\oplus\epsilon}$, resp. $\to^+$ in the case of $(SI_\ell)$, $(\eta S_\ell)$ and a few other rules (see [11]).

By (b), any reduction $R$ in SKInT, resp. SKInT$_\eta$, starting from a typable
term in system $S$ uses only finitely many instances of rules $(SI_\ell)$ and $(\eta S_\ell)$,
$\ell \ge 0$. But since ΣT terminates, there are finitely many reduction steps (in ΣT)
in between two $(SI_\ell)$ or $(\eta S_\ell)$-steps. So $R$ is finite, proving (iii).

To show (i), first notice that if $u \to_s v$ in SKInT (or SKInT$_\eta$), then
$[\![u]\!]_\oplus(s_0, \ldots, s_{n-1}) \to^* [\![v]\!]_\oplus(s_0, \ldots, s_{n-1})$ (resp. $\to^+$ if $u \to_s v$ by $(SI_\ell)$,
$(\eta S_\ell)$ and a few other rules) by head-reductions in $\lambda_{\oplus\epsilon}$. By (a), any
spine-reduction in SKInT, resp. SKInT$_\eta$, starting from a typable term in $S\omega$
uses only finitely many instances of $(SI_\ell)$ or $(\eta S_\ell)$,
$\ell \ge 0$. Since ΣT terminates, (i) follows.

To show (ii), we would like to use a similar argument, but any $\lambda_{\oplus\epsilon}$-term with
a definite positive typing normalizes only weakly, and then we need to show that
we can lift back the given normalization strategy in $\lambda_{\oplus\epsilon}$ to some normalization
strategy in SKInT; this is not easy. Instead, we observe the following. Let $u'$ be a
spine-normal SKInT-term, and write $u'$ as $S[u_1, \ldots, u_k]$. We may then write $S$ as
a word of the form . . . 

with L = Ij ov L & variable (in which case we let j =d/ — 1 ), with: 

ioi > . . . > iono > jl > ill > ■ ■ . > ilni > > ■ ■ ■ > jk > ikl > ■ ■ ■ > iknf, > j > ~f 

and fc > 0, no > 0, ni > 0, . . . , rife > 0, and when rii = 0, the notation ji > 
iii> ... > iim > ji+i means ji > ji+i - If n > dimn', then we have: 
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10 (^ 0 ? • • ■ ; ^n— 1 )| 

||ui]©(so,...,Sjfc_i)||s, 



3k \ 



■ ■ I 



IIU2]©(S0, • ■ • , • ■ • 



( 1 ) 



I I^fc]©('50j • ■ ■ J Sii— 1 ) I I ■ ■ • |sOno I ■ ■ • |s01 1 ■ ■ • Sn— 1 

where L' = |sj| if L = Ij, or L' = a; if L is a variable x, and . . . Sj 

denotes the sequence of terms Si, . . . , Sj from which the terms , . . . , Si^ 
have been omitted. If n < dimu', then ||m'] 0 (so, • ■ • , Sn-i)| is the ry-normal 
form of Xsn • . . . • Asdimu' • |[m']©(so, . . . , Sdimu')li is therefore of the form 

XSn * . . . ■ AStti—I ■ IM©(so J • ■ ■ 7 — 1 ) I ■ 

As far as types are concerned, let u' have a definite positive typing. Any (def- 
inite positive) 5o;-type of u' must be of the form /tq ^ Mioi Mioi-i-i ''"j 

and the typing derivation leading to this type must map each 1 < i < A:, to 
the types /tq ^ ''"ipj 1 £ P £ Wi, for some types Tip. Let /r' 

be [Til, • ■ • , Tirtii]- Then this typing derivation also gave L the type: 



Mo 






m'i 












( 2 ) 



Mifc-i-i 



Mfe-i ^ Mj2 
Mfe ^ Mil ~ 



Milni 



^ Mill ^ ^ Mji-1 

Miol Mioi-l-1 



If L is a variable, since the typing context is definite negative, the type above 
must be definite negative, so in particular every /i', 1 < z < fc, is definite positive. 
Since the type of u' was assumed definite positive, every fij is definite negative, 
so the types /zq ^ Min+i-i-i Tip of Ui are definite positive. Similarly, 

if L is Ij, then recall that j < iunt. (if nk yf 0) or j < jk (if Uk = 0 and 
A: yf 0), and by the form of the typing rule for Ij, every ^I'i, I < i < k must occur 
negatively in /i^, hence is definite positive (if A: yf 0; this is trivial if A; = 0), hence 
again the types assigned to Ui are all definite positive. 

Having made these remarks, let u be a SKInT-term with a definite positive 
S'w-typing. Then ||m] 0 (so, . . . , s„_i)| is a A-term with a definite positive Stu- 
typing, for any sequence sq, . . . , Sn-i of the right types, hence it /3-normalizes 
weakly. Recall that a weakly normalizing A-term t has a finite Bohm tree: let h(t) 
be the height of this tree. We show that, under the assumption that u has a defi- 
nite positive 5w-typing and that | |M] 0 (a;o, . . . , Xn-i) \ normalizes weakly for some 
sequence of variables xq, . . . , Xn-i, then u SKInT-normalizes weakly. This is by 
induction on h{t), where t = ||M] 0 (a;o, . . . ,Xn-i)\- First, by (i) and since u has 
an S'^i-typing, u is SKInT-solvable: let u' =d/ ^[mi, . . . , Uk] be any spine-normal 
form of u. Since u — u', as in case (z), ||zz] 0 (a;o, . . . ,Tri-i)| head-rewrites to 
||u'] 0 (a;o, . . . , a;„_i)|, and the latter is head-normal, since Xj is a variable (by 
inspection of Equation 1). By the remark on the types of spine-normal forms 
(Equation 2), each Ui, 1 < i < k, also has a definite positive typing. Let now ti 
be ||ui]©(a;o, . . . by Equation 1, h{U) < h{\lu'j^{xo, . . . ,Xn-i)\) = 

, a^n_i)|). So the induction hypothesis applies: each Ui SKInT- 
normalizes weakly, say to some term Vi. Therefore u SKInT-normalizes weakly 
to . . . ,Vk]- This proves (ii) in the case of SKInT-reduction. 

For SKInT^-reduction, notice that every SKInT-normalizable term is also 
SKInT^-normalizable. Indeed, observe that if u — > v by the 77 -rule {rjS(), and u 
is SKInT-normal, then so is 7 ;. □ 

The Normalization Theorem has an important corollary. Recall that a 
translation from λ-terms to a given language (say, SKInT) preserves strong 
normalization (resp. weak normalization, solvability) if and only if the translation of 
every strongly normalizing λ-term (resp. weakly normalizing, resp. solvable) is 
strongly normalizing (resp. weakly normalizing, resp. solvable). 

Corollary 1 (Preservation of Normalization Properties). Every translation 
mapping S-typable λ-terms to S-typable SKInT-terms preserves strong 
normalization. Every translation mapping Sω-typable λ-terms to Sω-typable 
SKInT-terms preserves solvability. Every translation mapping λ-terms having a definite 
positive typing in Sω to SKInT-terms having a definite positive typing in Sω 
preserves weak normalization. 

Proof. Notice that these translations need not map λ-terms to SKInT-terms 
of the same type: we just need to preserve typability, not the types themselves. 
Every strongly normalizing (resp. solvable, weakly normalizing) λ-term is typable 
in system S (resp. in Sω, in Sω with a definite positive typing) [2,21]; the result 
then follows from Theorem 1. □ 

It follows that the L* translation of [9], in particular, preserves strong 
normalization, weak normalization and solvability. This works also in the presence of 
η-rules. Also, Corollary 1 is stronger: essentially, any reasonable translation from 
the λ-calculus to SKInT will preserve all three normalization properties. 

Corollary 1 depends on the fact that strongly normalizing, resp. weakly 
normalizing, resp. solvable terms in the λ-calculus are all characterized in terms of 
types. We end this section by showing that the same holds in SKInT. 

First, define S-, resp. Sω-type substitutions θ as finite maps from type 
variables, a.k.a. base types in B, to S-, resp. Sω-types. For any type or context a, aθ 
denotes the result of applying θ to a; [τ/b] denotes the substitution mapping b 
to τ. The following is easy: 

Lemma 2. If Γ ⊢ u : τ is derivable in S, then for every S-type substitution θ, 
Γθ ⊢ u : τθ is derivable in S. 

If Γ ⊢ u : τ is derivable in Sω, then for every Sω-type substitution θ, 
Γθ ⊢ u : τθ is derivable in Sω. 

If Γ⁻ ⊢ u : τ⁺ is definite positive and derivable in Sω, then for every S-type 
substitution θ, Γ⁻θ ⊢ u : τ⁺θ is definite positive and derivable in Sω. 
Lemma 3. Every SKInT-normal term u has a typing in system S. Every 
spine-normal SKInT-term u has a typing in Sω. 

Proof. We first observe that every type can be written uniquely 
μ_0 → … → μ_{n-1} → b, where b ∈ B. We call n the arity of the type. By extension, 
we call arity of a typing Γ ⊢ u : τ the arity of τ. We call a typing as above S-normal 
if n ≥ 1 and μ_{n-1} = [b'] with b' a base type other than b; we call it Sω-normal 
if n ≥ 1 and μ_{n-1} = ω. 

Let the degree d(u) of a SKInT-term u be defined by: d(x) =df 0, d(I_ℓ) =df 
ℓ+1, d(S_ℓ(v,w)) =df ℓ, d(K_ℓ(v)) =df ℓ+1. We show the more general claims 
that: (i) every SKInT-normal term u has a normal S-typing of arity d(u)+1, 
and: (ii) every spine-normal term u has a normal Sω-typing of arity d(u)+1. 
This is by structural induction on u. 

Notice first that: (*) whenever a term u has an S-normal, resp. Sω-normal 
typing, then it also has S-normal, resp. Sω-normal typings Γ ⊢ u : τ of arbitrary 
higher arities in the same system: indeed, if u has a normal typing Γ ⊢ u : τ of 
arity n as above (with b the base type at the end), then it also has a normal 
typing of arity n+1, namely Γθ ⊢ u : τθ, by Lemma 2, with θ =df [[b''] → b' / b], 
b'' ≠ b' in the case of system S, θ =df [ω → b' / b] in the case of Sω. Claim (*) 
then follows by an easy induction on n. 

If u is a variable, then the normal typing x : [b'] → b ⊢ x : [b'] → b establishes 
(i), and x : ω → b ⊢ x : ω → b establishes (ii). If u is of the form I_ℓ, then we can 
choose the typing ⊢ I_ℓ : μ_0 → … → μ_{ℓ-1} → [[b'] → b] → [b'] → b with μ_0, …, 
μ_{ℓ-1} any S-types and b' ≠ b for (i), and ⊢ I_ℓ : μ_0 → … → μ_{ℓ-1} → [ω → b] → 
ω → b for (ii). 

When u is of the form Si{v, w), then, first, define the conjunction E' A E" of 
two contexts E' and E” as the collection of bindings x : p' f\ p” (when x : p' G E' 
and X \ p" G r”), x : p' {if X : p' G E' but x does not appear in E”), and x : p” 
(if x : p" G r" but x does not appear in E') . Notice also that (by examination of 
the reduction rules), if u is spine-normal (in particular, normal), then d{v) < I. 
We show claim (f) as follows: by induction, v has an S'-normal typing of arity 
d{v) -I- 1 < £ -I- 1, hence by (*) it has an S-normal typing of arity exactly £ -I- 1, 
say E' G V : Pq ^ ... ^ Ei-i W] b; by induction hypothesis again, w 

has an S-typing, hence by (*) we may assume w.l.o.g. that w has an S-typing 

of arity at least £, say E' G w : p'f, ^ ... ^ h-i-i then, we can derive 

r'[T"/b'] G V ■. Po ^ ... ^ Pi -1 [t"] ^ b, where pi =d/ p{[t” / b'], 0 < i < n, 

by Lemma 2; then T'[r"/6'] A E” G u : po ^ ... ^ Pi-i — > 6 is an S-typing of 
arity d{u) = £; substituting b for, say, [b”] b'" (using Lemma 2), we get the 

required S-normal typing of arity d{u) -\- 1. Showing {ii) is easier: by induction, v 
has an Sw-normal typing of arity d{v) -I- 1 < £ -I- 1, hence by (*) it has an Suj- 
normal typing of arity £ -I- 1, say E' G v : Pq ^ . . . ^ Ei-i ^ uj ^ b; then 
r G u : po ^ Pi -1 —>■ to —>■ b' is the required Sw-normal typing of arity 

£ -I- 1, where E =d/ E'[lu — > b'/b], and pi =d/ Pi[to — > E/b], 0 < i < i. 

When u is of the form Ki{v), then observe that (by examination of the 
reduction rules), if u is spine-normal (or normal), then d{v) < £. We show {i) as 
follows: by induction v has an S-normal typing of arity d{v) -\- 1, hence by (*) 
an S'-normal typing P' h f : /ig — > . . . — > — > [b'] —^boi arity £ + 1; then 

r' \- Ki{v) : /Xg — > . . . ^ — *■ M ^ ^ is the required S'-normal typing 

for u, for any S-type And (ii) follows by a similar construction, replacing [b'] 
by Lu and letting fi be any Sw-type. □ 

Contrary to what happens in the λ-calculus, types in Sω are not preserved 
by the inverse of SKInT-reduction. However: 

Lemma 4. If u →* v in SKInT_η, and Γ ⊢ v : τ in Sω, then Γθ ⊢ u : τθ for 
some S-type substitution θ. 

Proof. More concisely, we shall say that whenever v has some Sω-type τ, then u 
has type τθ for some S-substitution θ (where the contexts Γ and Γθ are 
understood). We first show this when u = l, v = r, and l → r is any of the reduction 
rules. There are only three interesting rules: 

- (SI_ℓ): l = S_ℓ(I_ℓ, w), r = w. If τ has arity n ≥ ℓ, then τ is of the form 
μ_0 → … → μ_{ℓ-1} → τ', and we can take I_ℓ of type μ_0 → … → μ_{ℓ-1} → [τ'] → τ', 
so that l has type τ. If τ has arity n < ℓ, then let τ be μ_0 → … → μ_{n-1} → b, 
and θ be [τ'/b], where τ' is any S-type of arity at least ℓ − n: by Lemma 2, r 
has type τθ, and since τθ has arity at least ℓ, then as above l has type τθ. 

- (SK_ℓ): l = S_ℓ(K_ℓ(u), w), r = u. If τ has arity n ≥ ℓ+1, then write τ as 
μ_0 → … → μ_{ℓ-1} → μ_ℓ → τ'; we can give K_ℓ(u) the Sω-type μ_0 → … → 
μ_{ℓ-1} → ω → μ_ℓ → τ', and therefore l has type τ as well (notice that we 
need not give w a type). Otherwise, as in the (SI_ℓ) case, r and l have type 
τθ for some S-substitution θ of arity ℓ+1 − n. 

- (ηS_ℓ): l = S_{ℓ+1}(K_ℓ(u), I_ℓ), r = u. If the type τ of r has arity n ≥ ℓ+1, then 
write τ as μ_0 → … → μ_ℓ → τ', and μ_ℓ as [τ_1, …, τ_k]. We can give I_ℓ all the 
types μ_0 → … → μ_{ℓ-1} → μ_ℓ → τ_i, 1 ≤ i ≤ k, and we can give K_ℓ(u) the 
type μ_0 → … → μ_{ℓ-1} → μ_ℓ → [τ_1, …, τ_k] → τ', so l can be given type τ 
again. If n < ℓ+1, then, if τ is of the form μ_0 → … → μ_{n-1} → b, let θ be 
[τ'/b] for any S-type τ' of arity at least ℓ+1 − n: then l has type τθ. 

For all the other rules, we can choose the identity substitution for θ (see [11]). 

We now claim that whenever u → v (namely, when u = C[l], v = C[r], 
and l → r is some reduction rule) and v has type τ in Sω, then u has some 
type τθ in Sω, where θ is an S-substitution. This is a straightforward structural 
induction on C, using Lemma 2. The Lemma then follows by induction on the 
length of the reduction u →* v. □ 

Theorem 2. Every solvable SKInT-term has an Sω-typing. Every 
SKInT-weakly normalizing, resp. SKInT_η-weakly normalizing term has a 
definite positive Sω-typing. 

Proof. Let u be solvable. Then u →* v, where v is spine-normal; by Lemma 3, v 
has an Sω-typing Γ ⊢ v : τ; by Lemma 4, u has a typing of the form Γθ ⊢ u : τθ, 
where θ is an S-substitution. This is clearly an Sω-typing. 

On the other hand, if u is weakly normalizing (in SKInT or in SKInT_η), then 
u →* v for some SKInT-normal term v; by Lemma 3, v has a definite positive 
Sω-typing Γ⁻ ⊢ v : τ⁺; by Lemma 4, u has a typing of the form Γ⁻θ ⊢ u : τ⁺θ, 
where θ is an S-substitution. By Lemma 2, this typing is therefore not only an 
Sω-typing, but is also definite positive. □ 

In fact, the proof even shows that every weakly normalizing term, whether 
in SKInT or in SKInT_η, has an Sω-typing where ω does not occur at all (but ω 
may occur in the typing derivation). This is similar to [8], Theorem 6.12. 

Theorem 3. If u is strongly normalizing in SKInT, resp. SKInT_η, then it has 
an S-typing. 

Proof. As u is strongly normalizing, any normalization strategy terminates. 
Choose any innermost strategy, i.e. any strategy that reduces only redexes whose 
strict subterms are all normal. (In particular, the redex S_ℓ(K_ℓ(u_1), u_2) can only 
be reduced when u_1 and K_ℓ(u_1) are normal.) Let ν(u) denote the length of the 
longest reduction sequence in SKInT starting from u according to this strategy. 

We show the claim by induction on ν(u). If ν(u) = 0, then this is by Lemma 3. 
Otherwise, assume that u → v (so that ν(v) < ν(u), hence by induction v has 
an S-typing Γ ⊢ v : τ). If the reduction from u to v is by any rule except (SK_ℓ), 
ℓ ≥ 0, then u has an S-typing Γθ ⊢ u : τθ, where θ is an S-substitution: this is 
as in the proof of Lemma 4. 

In case u → v by (SK_ℓ), this does not work any longer, since we cannot use ω 
(not an S-type). Instead, observe that u can be written as C[S_ℓ(K_ℓ(u_1), u_2)], and 
that v = C[u_1]. By induction hypothesis, and since ν(u_2) < ν(u), u_2 has an 
S-typing Γ_2 ⊢ u_2 : τ_2, and w.l.o.g. we may assume that τ_2 has arity at least ℓ, 
i.e. that τ_2 = μ''_0 → … → μ''_{ℓ-1} → τ''. Since the chosen reduction strategy is 
innermost, u_1 is normal, and by Lemma 3 (more precisely, by Claim (i) in its 
proof), u_1 has an S-normal typing of arity exactly d(u_1)+1; but since the 
reduction is innermost again, K_ℓ(u_1) is normal as well, so d(u_1) ≤ ℓ, and (by 
Remark (*) in the proof of Lemma 3) therefore u_1 has an S-normal typing of 
arity £ + say v Ti h mi : — > . . . ^ l^'i-i [^1 b. So we may now 

derive P h S^(A:^(mi), M 2 ) : /xq ^ ^ ^ b, where F =d/ Pi[P'/b'] A T 2 , 

/Xj =d/ lb']£\pt-'l for each z, 0 < x < ^. It follows that u itself has an S-typing 

(which is an instance of the latter), by a straightforward induction on C. 

The case of SKInT_η-strongly normalizing terms is completely similar. □ 

Corollary 2. The following equivalences hold, for any SKInT-term u: 

- u solvable ⇔ u typable in Sω 

- u SKInT-weakly normalizing ⇔ u SKInT_η-weakly normalizing 
  ⇔ u has a definite positive Sω-typing 

- u SKInT-strongly normalizing ⇔ u SKInT_η-strongly normalizing 
  ⇔ u typable in S. 

4 Conclusion 

We have shown that SKInT enjoys exactly the same properties as the 
λ-calculus, as far as the relationship between conjunctive types and various 
normalization properties is concerned. This implies that SKInT preserves solvability, 
weak normalization and strong normalization, for any reasonable, i.e. 
typability-preserving, translation from the λ-calculus to SKInT, in particular for the L* 
translation of [9]. 

It is then interesting to compare SKInT with other calculi of explicit 
substitutions. Indeed, although SKInT was not presented as a calculus of explicit 
substitutions in [9], it is definitely so: I_ℓ is de Bruijn index ℓ ≥ 0, and the de Bruijn 
substitution [0 := v_0, …, n := v_n] applied to u is (…((u ∘_0 v_0) ∘_1 v_1)…) ∘_n v_n, 
where the operator ∘_ℓ is defined by: u ∘_ℓ v =df S_{ℓ+1}(K_ℓ(u), v), as the reader is 
invited to check. Then ΣT plays the role of the substitution calculus, and βI 
(i.e., (SI_ℓ)) more or less plays the role of the β rule (the connection is not as 
direct as in more traditional calculi of explicit substitutions, though). 
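
The encoding of de Bruijn substitutions just described can be sketched in Python. This is only an illustration under assumptions: the class names Var, I, K, S and the helpers compose, substitute and eta_step are hypothetical and do not come from [9], and the rule (ηS_ℓ) is taken in the form S_{ℓ+1}(K_ℓ(u), I_ℓ) → u, as it appears in the proof of Lemma 4.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class I:
    level: int            # I_l, read as de Bruijn index l


@dataclass(frozen=True)
class K:
    level: int            # K_l(arg)
    arg: "Term"


@dataclass(frozen=True)
class S:
    level: int            # S_l(fun, arg)
    fun: "Term"
    arg: "Term"


Term = Union[Var, I, K, S]


def compose(level: int, u: Term, v: Term) -> Term:
    """u o_level v  =df  S_{level+1}(K_level(u), v)."""
    return S(level + 1, K(level, u), v)


def substitute(u: Term, values: list) -> Term:
    """Build (...((u o_0 v_0) o_1 v_1)...) o_n v_n for values = [v_0, ..., v_n]."""
    result = u
    for level, v in enumerate(values):
        result = compose(level, result, v)
    return result


def eta_step(t: Term):
    """One root step of (eta S_l): S_{l+1}(K_l(u), I_l) -> u; None if t is not such a redex."""
    if (isinstance(t, S) and isinstance(t.fun, K) and isinstance(t.arg, I)
            and t.level == t.fun.level + 1 and t.arg.level == t.fun.level):
        return t.fun.arg
    return None
```

With this reading, u ∘_ℓ I_ℓ is exactly an (ηS_ℓ)-redex contracting to u, which matches the intuition that substituting index ℓ for itself changes nothing.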

Together with the results of [9], this advances the table of properties of calculi 
with explicit substitutions proposed by Lang and Rose [16]^ to the following 
(⊃β means “simulates β-reduction”, CRM is “is Church-Rosser in the presence 
of meta-variables”, SSN is “has a strongly normalizing substitution subcalculus”, 
PSN means “preserves strong normalization”). 



Name    #    ⊃β   first-order?  unconditional?  CRM  SSN  PSN  Reference
λσ      11   yes  yes           yes             no   yes  no   [1]
        21   yes  yes           yes             yes  yes  no   [6]
λυ      8    yes  yes           yes             no   yes  yes  [17]
λs      ∞    yes  yes           yes             no   yes  yes  [13]
λs_e    ∞    yes  yes           yes             yes  ?    no   [14]
λζ      13   no   yes           yes             yes       yes  [19]
λxci    ∞    yes  no            no              yes  yes  ?    [16]
λx^ci   ∞    yes  yes           no              no   yes  ?    [16]
SKIn    ∞    yes  yes           yes             yes  no   no   [9]
SKInT   ∞    yes  yes           yes             yes  yes  yes  [9]
λβ      ∞    yes  no                            yes       yes  Church, see [3]



The only defect we know of SKInT is that it encodes the λ-calculus in a 
slightly complicated and non-unique way [9]: e.g., you may wish to use the L* 
translation, or any other. We believe that this drawback is offset by the fact 
that SKInT has basically all the properties of the λ-calculus: confluence, even 
on open terms; standardization; strong normalization of simply-typed terms; a 
conservative extension of the λ-calculus with β, resp. βη-conversion (see [9]); 
preservation of solvability, weak and strong normalization; characterization of 
solvable, weakly and strongly normalizable terms by conjunctive typings (this 
paper); all this in an infinite but regular first-order equational formulation. 

Further work should investigate whether easier, direct proofs of the results 
presented here are possible, using variants of the reducibility method [15,8]. 



^ F. Lang notes that some of the results of this paper are wrong, and we have followed 
his remarks in the table (see http://www.ens-lyon.fr/~flcing/papiers.html). 



120 Jean Goubault-Larrecq 



References 

1. M. Abadi, L. Cardelli, P.-L. Curien, and J.-J. Lévy. Explicit substitutions. In 
POPL’90, pages 31-46, 1990. 

2. S. van Bakel. Complete restrictions of the intersection type discipline. Theoretical 
Computer Science, 102(1):135-163, 1992. 

3. H. Barendregt. The Lambda Calculus, Its Syntax and Semantics, volume 103 of 
Studies in Logic and the Foundations of Mathematics. North-Holland, 1984. 

4. F. Cardone and M. Coppo. Two extensions of Curry’s type inference system. In 
P. Odifreddi, editor, Logic and Computer Science, volume 31 of The APIC Series, 
pages 19-76. Academic Press, 1990. 

5. P.-L. Curien. Categorical Combinators, Sequential Algorithms and Functional 
Programming. Pitman, London, 1986. 

6. P.-L. Curien, T. Hardin, and J.-J. Lévy. Confluence properties of weak and strong 
calculi of explicit substitutions. J. ACM, 43(2):362-397, 1996. 

7. H. B. Curry and R. Feys. Combinatory Logic, volume 1. North-Holland, 1958. 

8. J. Gallier. Typing untyped λ-terms, or reducibility strikes again! Annals of Pure 
and Applied Logic, 91(2-3):231-270, 1998. 

9. H. Goguen and J. Goubault-Larrecq. Sequent combinators: A Hilbert system for 
the lambda calculus. Mathematical Structures in Computer Science, 1999. 

10. J. Goubault-Larrecq. On computational interpretations of the modal logic S4 IIIb. 
Confluence and conservativity of the λevQ_H-calculus. Research report, Inria, 1997. 

11. J. Goubault-Larrecq. A few remarks on SKInT. Research report RR-3475, Inria, 
1998. 

12. T. Hardin and J.-J. Lévy. A confluent calculus of substitutions. In France-Japan 
Artificial Intelligence and Computer Science Symposium, 1989. 

13. F. Kamareddine and A. Ríos. A λ-calculus à la de Bruijn with explicit 
substitutions. In PLILP’95, pages 45-62. Springer Verlag LNCS 982, 1995. 

14. F. Kamareddine and A. Ríos. Extending a λ-calculus with explicit substitution 
which preserves strong normalisation into a confluent calculus on open terms. 
Journal of Functional Programming, 7:395-420, 1997. 

15. J.-L. Krivine. Lambda-calcul, types et modèles. Masson, 1992. 

16. F. Lang and K. H. Rose. Two equivalent calculi of explicit substitution with 
confluence on meta-terms and preservation of strong normalization (one with names 
and one first-order). Presented at WESTAPP’98, 1998. 

17. P. Lescanne and J. Rouyer-Degli. From λσ to λυ: a journey through calculi of 
explicit substitutions. In POPL’94, 1994. 

18. P.-A. Melliès. Typed lambda-calculi with explicit substitutions may not terminate. 
In TLCA’95, pages 328-334. Springer Verlag LNCS 902, 1995. 

19. C. A. Muñoz Hurtado. Confluence and preservation of strong normalization in an 
explicit substitutions calculus. In LICS’96. IEEE, 1996. 

20. P. Sallé. Une extension de la théorie des types en λ-calcul. In 5th ICALP, pages 
398-410. Springer Verlag LNCS 62, 1978. 

21. E. Sayag. Types intersections simples. PhD thesis, Université Paris VII, 1997. 



Modular Structures as Dependent Types in 

Isabelle 



Florian Kammüller 

Computer Laboratory, University of Cambridge 



Abstract. This paper describes a method of representing algebraic 
structures in the theorem prover Isabelle. We use Isabelle’s higher order 
logic extended with set theoretic constructions. Dependent types, 
constructed as HOL sets, are used to represent modular structures by 
semantical embedding. The modules remain first class citizens of the logic. 
Hence, they enable adequate formalization of abstract algebraic 
structures and a natural proof style. Application examples drawn from 
abstract algebra and lattice theory (the full version of Tarski’s fixpoint 
theorem) validate the concept. 



1 Introduction 

The initial aim of this research was to find a module system for the theorem 
prover Isabelle where modules are first class citizens, i.e. have a representation 
in the logic. This seems important when we want to formalize (mathematical) 
theories in which abstract entities are contained. Examples for such theories are 
common in abstract algebra. For example, in group theory we define groups as 
abstract objects. A group can be represented by a signature and axioms, but it 
is at the same time a logical formula; we say “G is a group”. Other examples 
include formal methods of computer science where we have abstract notions like 
schemas or abstract machines. 

In classical approaches, modules for theorem provers are outside the logic: 
they do not have a logical representation, but instead serve the efficient organization 
of theories. Nevertheless, most of the theorem provers that have powerful module 
systems (e.g. [OSR93,FGT93,GH93]) suggest using their modules as 
representations for (algebraic) structures. Although the encapsulation and abstraction 
achieved by packaging structures into modules is sensible, it does not constitute 
an adequate representation. This becomes obvious once one leaves the scope of 
toy examples [KP99]. 

We try to combine the convenience of the representation of algebraic struc- 
tures as modules with a sound logical treatment of modular structures avoiding 
any restrictions of reasoning. Using an extension of Isabelle HOL with a no- 
tion of sets we define dependent types as sets in order to find an embedding of 
signatures in the logic. Using this embedding we can represent abstract alge- 
braic structures as such dependent “types” . Furthermore, we use the very recent 
concept of record types in Isabelle [NW98] to represent the element patterns of 
algebraic structures. For example, we represent a group as a record with four 
fields: the carrier set, the binary operation, the inverse, and the unit element. 
The class of all groups is represented by a HOL set over this record type. 

This paper first explains our notion of algebraic structures and gives examples 
in Sect. 2. In Sect. 3 dependent types and their formalization as sets in Isabelle 
HOL are introduced. Their application to represent structures is described. Ex- 
amples of abstract algebra and the full form of Tarski’s fixpoint theorem validate 
the construction of the concept of algebraic structures in Sect. 4. Finally, we dis- 
cuss some related work and draw some conclusions in Sect. 5. 



2 Algebraic Structures 

By an algebraic structure we mean a set of concrete mathematical objects that 
are described as an abstract object. That is, an algebraic structure is a set 
of concrete entities which are considered to be similar according to a bunch 
of characterizing rules and a general pattern of appearance, while abstracting 
from other concrete characteristics. Examples for algebraic structures are groups, 
rings, homomorphisms, etc. The algebraic structure of groups, say, is the class 
of all concrete examples of groups. Hence, the structure is formed by abstracting 
over elements of similar appearance that fulfill common properties. 

We understand program specifications, definitions of formal languages, finite 
automata, and the like, also as algebraic structures. Certainly, logical theories 
can as well be seen as algebraic structures, but it is not our aim to express logics 
like that. In some respects our view of algebraic structures corresponds to the 
notion of “Little Theories” [FGT93] in IMPS, but does not try to capture the 
notion of a logical theory. 

In this section we characterize our notion of simple algebraic structure and 
higher order structure. We use an informal notion of signature instead of modules 
because that is what the latter basically are. We do not use a separate syntactical 
description language for those signatures because we think that for the encoding 
of mathematical structures our method of direct encoding in HOL sets plus 
dependent types is sufficiently self explanatory. 

In the following we will talk about structures as sets of objects. We are using 
the set notion of Isabelle HOL as a foundation for this work. This notion is 
defined in terms of predicates and is thus — in a set-theoretic sense — rather a 
notion of classes than sets. 



2.1 Simple and Higher Order Structures 

An algebraic structure is a set of mathematical objects. They can be syntactically 
represented by their signature, i.e. by the arities of their elements and the rules 
which hold for the elements of the structure. An object matching the arities and 
fulfilling the rules is an element of a structure. A syntactical description of a 
structure S by its signature and related rules is of the form: 

signature S(x_1, …, x_n) 
  x_1 ∈ A_1 
  … 
  x_n ∈ A_n 
  P_1 
  … 
  P_m 

where the A_i are the arities of the parameters, i ∈ {1, …, n}. The P_k are 
properties in which the parameters x_i can occur, k ∈ {1, …, m}. The arities can denote 
types or sets depending on the framework. This syntax is a simplified form of 
the style of modules as seen in other theorem provers [OSR93,FGT93,GH93]. 

The associated meaning of this syntactical description of signature S is what 
we consider as an algebraic structure 

\[ [\![ S ]\!] = \{ (x_1, \ldots, x_n) \in A_1 \times \cdots \times A_n \mid P_1 \wedge \cdots \wedge P_m \} \]

where in P_i any of {x_1, …, x_n} can occur. We call the elements x_1, …, x_n 
parameters of the structure S. 
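
For finite arities, the meaning of a signature can be computed directly as a filtered Cartesian product. The following Python sketch is only an illustration: the function name meaning and the example signature are hypothetical, and arities are assumed to be finite iterables rather than HOL types or sets.

```python
from itertools import product


def meaning(arities, properties):
    """Return the set of tuples (x_1, ..., x_n) drawn from A_1 x ... x A_n
    that satisfy every property P_1, ..., P_m."""
    return {
        xs
        for xs in product(*arities)
        if all(p(*xs) for p in properties)
    }


# Hypothetical signature S(x, y) with x, y in {0, 1, 2, 3},
# P_1: x < y and P_2: x + y is even.
example = meaning(
    [range(4), range(4)],
    [lambda x, y: x < y, lambda x, y: (x + y) % 2 == 0],
)
```

Each element of the resulting set plays the role of one concrete object of the structure; the parameters are the tuple components.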

Structures may possibly be parameterized over other structures. We call such 
structures higher order structures in contrast to simple structures. To identify 
the structures that are parameters of higher order structures, we use the term 
parameter structures, and the structure, which is defined by the higher order 
structure itself, we call image structure. 

For the definition of simple structures, we use sets of extensible records. 
Record types are used as a template for the pattern of appearance of the 
structure’s elements. They give us the selectors, which are projection functions 
enabling reference to the constituents of a simple structure. Although we use 
extension of record types to describe how more complex types are built from simpler 
ones, e.g. rings from groups, we do not use the extensibility. The latter is a 
feature of extensible record types that makes it possible to model “late binding” [NW98], 
which is not needed in the way we model structures. 

The definition of higher order structures needs a device to refer to the formal 
parameters. Here we employ the set theoretic construction of dependent types. 
It enables the use of constraints on parameter structures in the definition of an 
image structure. The selectors of the parameter structures allow us to refer to their 
constituents. 

For the parameter tuple par = (x_1, …, x_n) of a simple structure we define a 
record α par-sig as^ 

record α par-sig = 
  _.(x_1) :: A_1 (postfix) 
  … 
  _.(x_n) :: A_n (postfix) 

^ We assume here syntax definition possibilities that are planned for records though 
not yet available. 

The underscore defines the argument positions of the field selectors of this record. 
For example, if T is a term of appropriate record type, i.e. a suitable n-tuple, 
we can select the field x_j of T by T.(x_j). In general, the elements of the tuple 
will have names like carrier, or inverse, so the naming discipline of the record 
field selectors that we chose is more informative than indexing by numbers. 

The representation of a simple structure is given as a set of records; the 
record type defines the element pattern of the structure. 



2.2 Example: Groups and Homomorphisms 

A group is constituted by a carrier set and a binary function ∘ on that set, 
such that the function ∘ is associative, and for every element x in the carrier 
there exists an inverse x⁻¹. The carrier set also contains a neutral element e. The 
syntactical representation of a group by a signature is 

signature Group (cr, ∘, inv, e) 
  ∘ ∈ cr × cr → cr 
  inv ∈ cr → cr 
  e ∈ cr 
  ∀ x ∈ cr. x ∘ e = x 
  ∀ x ∈ cr. x ∘ (inv x) = e 
  ∀ x, y, z ∈ cr. x ∘ (y ∘ z) = (x ∘ y) ∘ z 

According to Sect. 2.1, the mathematical meaning that we associate to this 
example is 

\[ [\![ \mathit{Group} ]\!] = \{ (\!| \mathit{cr}, \circ, \mathit{inv}, e |\!) \mid \circ \in \mathit{cr} \times \mathit{cr} \to \mathit{cr} \wedge \mathit{inv} \in \mathit{cr} \to \mathit{cr} \wedge e \in \mathit{cr} \wedge (\forall x \in \mathit{cr}.\ x \circ e = x) \wedge (\forall x \in \mathit{cr}.\ x \circ \mathit{inv}(x) = e) \wedge (\forall x, y, z \in \mathit{cr}.\ x \circ (y \circ z) = (x \circ y) \circ z) \} \]

The notation (| cr, ∘, inv, e |) of the elements of this set stands for an extensible 
record term. In this context it is sufficient to understand them as products. The 
base type of the set Group is defined by the following record definition^. 

record α group-sig = 
  _.(cr) :: α set (postfix) 
  _.(f) :: [α, α] ⇒ α (postfix) 
  _.(inv) :: α ⇒ α (postfix) 
  _.(e) :: α (postfix) 



The structure Group is of type (α group-sig) set. In the following example of a 
higher order structure for group homomorphisms, we see how the field selectors 
are used to refer to the constituents of a group. 
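
As an informal illustration of a record type together with a membership predicate for the structure, here is a Python sketch. It is not Isabelle code: the names GroupSig, in_group and cyclic are hypothetical, carriers are assumed to be finite sets of integers, and the predicate checks exactly the conditions listed in the Group signature above.

```python
from dataclasses import dataclass
from itertools import product
from typing import Callable, FrozenSet


@dataclass(frozen=True)
class GroupSig:
    """Record with the four fields of the group signature."""
    cr: FrozenSet[int]                 # carrier set
    f: Callable[[int, int], int]       # binary operation
    inv: Callable[[int], int]          # inverse
    e: int                             # unit element


def in_group(g: GroupSig) -> bool:
    """Membership test for the structure Group, by exhaustive checking."""
    cr = g.cr
    well_typed = (
        all(g.f(x, y) in cr for x, y in product(cr, repeat=2))
        and all(g.inv(x) in cr for x in cr)
        and g.e in cr
    )
    if not well_typed:
        return False
    right_unit = all(g.f(x, g.e) == x for x in cr)
    right_inverse = all(g.f(x, g.inv(x)) == g.e for x in cr)
    associative = all(
        g.f(x, g.f(y, z)) == g.f(g.f(x, y), z)
        for x, y, z in product(cr, repeat=3)
    )
    return right_unit and right_inverse and associative


def cyclic(n: int) -> GroupSig:
    """The integers modulo n under addition, as a GroupSig record."""
    return GroupSig(frozenset(range(n)), lambda x, y: (x + y) % n,
                    lambda x: (-x) % n, 0)
```

The field selectors g.cr, g.f, g.inv and g.e correspond to the postfix selectors of the record; the set Group is then the collection of all records for which in_group holds.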

A homomorphism of groups is a map from one group to another group that re- 
spects group operations. The parameters of a structure Horn for homomorphisms 
are groups themselves, i.e. we have a higher order structure. The following syn- 
tactical form encloses the parameter structures in square brackets. 

^ In the remainder of this paper we name the group operation /, instead of o, because 
we need to refer to the group G and in prefix notation G.(f) looks more natural. An 
improvement of syntax for such implicit references is given by locales (c.f. Sect. 5) 
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signature Horn [G,H £ Group ] ( 

$ £ G.{cr) H.{cr) 

yx,y £ G.{cr). <P{G.{f) x y) = H.{f) <P{x) <P{y) 

In the definition of the mathematical structure we have to add “where G and H
are elements of the structure Group”. That is, a mathematical object representing
a homomorphism between groups also has to carry the two groups in itself. It
is a triple (G, H, Φ) of two groups and a homomorphism between them. The
homomorphism depends on the elements G and H. In the image structure we need
to refer to the parameter structures G and H. Hence, we choose a dependent
type, the Σ-type, to define the structure for homomorphisms.

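\[
\mathit{Hom} = \Sigma_{G \in \mathit{Group}}\, \Sigma_{H \in \mathit{Group}}\,
\{\, \Phi \mid \Phi \in G.\langle cr\rangle \to H.\langle cr\rangle \;\land\;
(\forall x, y \in G.\langle cr\rangle.\ \Phi(G.\langle f\rangle\, x\, y) = H.\langle f\rangle\, \Phi(x)\, \Phi(y)) \,\}
\]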

Now the parameter groups G and H are bound by the Σ operator and we can
refer to them and their constituents by using the projections, e.g. G.<f>. In
the following section, we explain the notion of dependent types, and how we
represent them using set theoretic constructions.

3 Dependent Types as Structure Representation 

The textbook introduction to type theory [NPS90, page 52] explains the main
reason for the introduction of the Π-set as the interpretation of the universal
quantifier. The Heyting interpretation of this quantifier is [Hey56]

$\forall x \in A.\ B(x)$ is true if we can construct a function which, when applied
to an element a in the set A, yields a proof of B(a).

The dependent sum Σ enables us to deal with the existential quantifier, i.e. ∃ can
be defined as

\[ \exists x \in A.\ B(x) \;=\; \Sigma_{x \in A} B(x) \]

We use the dependent sets in the same sense, but restrict their use to the description
of structures. We consider A and B as structures, and not general formulas.
So, we use the dependent sets as type theory uses them, but in a more naive way,
restricting ourselves to the statements x ∈ A and not interpreting this as “x is
a proof of formula A”.

The idea is to use the syntactical signature description of the structure as a
set B(x) with a formal parameter x. This formal parameter is an element of
the first set A. In case of more than one parameter structure the nesting of the
dependent type constructors Σ and Π just accumulates.

We show in this section how dependent types are formalized in HOL and 
how this formalization can be used to represent higher order structures. 

3.1 Isabelle Representation 

The system Isabelle HOL implements a simple type theory [Chu40] and has
no dependent types. The object logic HOL of Isabelle is extended by a notion 
of sets. Sets are here essentially predicates, rather than “built-in” by ZF-style 
axioms. We use this extension to define dependent types as sets in Isabelle. 
Set Representation One can consider the Σ-type as a general form of the
Cartesian product. If we represent $\Sigma_{x \in A} B(x)$ as a set, it is thus

\[ \Sigma_{x \in A} B(x) \;=\; \bigcup_{x \in A}\, \bigcup_{y \in B(x)} \{(x, y)\} \]

This representation of the Σ-type is used in HOL.

The Π-type is the type of dependent functions. It is related to the Σ-type.
We can express this type as a set by considering the subsets of Σ which can be
seen as functions

\[ \Pi_{x \in A} B(x) \;=\; \{\, f \in \mathcal{P}(\Sigma_{x \in A} B(x)) \mid \forall x \in A.\ \exists! y \in B(x).\ f(x) = y \,\} \tag{1} \]

where $\mathcal{P}$ denotes the powerset.



Implementation in Isabelle In the distribution of Isabelle HOL the Σ-type
is already defined in terms of HOL sets; the Π-type is not.

The most natural way to define Π seems to be to use definition 1, defining Π
in terms of Σ. But then the functions we would get would be sets of pairs, and
we would develop a new domain of functions inside HOL, when there are already
functions.

The existing functions in HOL are the elements of the function type α => β,
where α and β indicate arbitrary types, and => is the function type constructor.
There is a notation for λ-abstraction available, which allows us to define new
functions. We would like to define function sets, i.e. sets of elements of the HOL
type α => β, and on top of that we want the co-domain β of these
functions to be able to depend on the input to the function. Ideally, the type β should
depend on some x of type α. Since HOL does not have dependent types, it is
impossible to integrate the dependency at the level of types. But we can define
a non-dependent type for the constructor Π as

[α set, α => β set] => (α => β) set

Then, we can assign the above type to a constant Π in HOL and add the idea
of dependency to the definition of this constructor.

\[ \Pi_{x \in A} B(x) \;=\; \{\, f \mid \forall x.\ \text{if } x \in A \text{ then } f(x) \in B(x) \text{ else } f(x) = (@y.\ \mathit{True}) \,\} \]

By using the more explicit language of sets we achieve that the codomain is a
set which depends on the argument to the function. The “else” case is necessary
to achieve extensionality for the Π-sets.

The non-dependent function sets are a special case of this definition of Π.
Using Isabelle’s pretty printing facilities, we get a nice syntactical representation
for that and can now write A → B for the set of functions from a set A to a
set B.

What we are doing here is classification. Equality compares functions according
to their behavior on the set A. That is, we do not care about what a function
in $\Pi_{x \in A} B(x)$ does outside A. We want to think of all functions which behave
alike on A as the same function.

To reassure ourselves that the definition of Π is sound we have established
a bijection ΠBij between the classical definition from Equation 1 and the above
HOL function set as

\[ \Pi\mathit{Bij}\; A\; B \;=\; \lambda f \in \Pi_{x \in A} B.\ \{(x, y) \mid x \in A \land y = f\,x\} \]

We proved in Isabelle that this map is actually a bijection.
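
The two readings of Π can be compared on small finite sets. The following Python sketch is only a finite model, not the Isabelle theory: it builds the classical Π of Equation 1 as functional subsets of Σ, the HOL-style Π as total functions that take a fixed default value outside A, and the map ΠBij between them. The names `CARRIER`, `DEFAULT`, `sigma`, `pi_graphs`, `pi_hol` and `pi_bij` are assumptions made for illustration, and `DEFAULT` merely stands in for the choice term (@y. True).

```python
from itertools import product

CARRIER = (0, 1, 2)   # stands in for the HOL type alpha (assumed finite here)
DEFAULT = "arb"       # stands in for the fixed value (@y. True)

def sigma(A, B):
    """Sigma_{x in A} B(x) as a set of pairs."""
    return {(x, y) for x in A for y in B(x)}

def pi_graphs(A, B):
    """Equation (1): subsets of Sigma that pick exactly one y for every x in A."""
    choices = [[(x, y) for y in B(x)] for x in sorted(A)]
    return {frozenset(graph) for graph in product(*choices)}

def pi_hol(A, B):
    """HOL-style Pi: total functions on CARRIER, pinned to DEFAULT outside A.

    A function is represented by the tuple of its values on CARRIER.
    """
    options = [sorted(B(x)) if x in A else [DEFAULT] for x in CARRIER]
    return set(product(*options))

def pi_bij(A, f):
    """PiBij: map a HOL-style function to its graph restricted to A."""
    return frozenset((x, f[CARRIER.index(x)]) for x in A)
```

With A = {0, 1} and B(x) = {x, x + 10}, both constructions yield four elements, and `pi_bij` maps one onto the other without collisions. The pinned default value is what makes two functions that agree on A equal as tuples.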

3.2 Algebraic Formalization with Π and Σ

We concentrate here on the representation of higher order structures. As already
pointed out in Sect. 2 we use sets of records for simple structures. We use the
dependent type constructors Σ and Π to represent higher order structures, that
is, to express structures where the parameters are elements of structures themselves.
Roughly speaking, the Σ-types are used for general relations between
parameter and image structure. When this relation is a function, i.e. the construction
of the image structure is unique and defined for all elements of the
parameter structure, then we can construct elements of the higher order structure
using the λ-notation. In that case, the higher order structure is a set of
functions, i.e. a Π-type structure.

Use of Σ The interpretation of the Σ-type is that of a relation between parameter
and image structure. Higher order structures whose image structures
are defined for certain input parameters, but not necessarily for all, can be represented
by Σ. So, the elements of these higher order structures are pairs of
parameter and image structure elements; for a structure such as $\mathit{Struc} = \Sigma_{x \in A} B(x)$,
we can write this membership as (a, b) ∈ Struc.

But we also want to instantiate the structure. By Struc ↓ a we denote
the instantiation or application of the structure. What we are interested in is to
get an instance of the image structure B, where a is substituted for the formal
parameter x. That is, we want to derive B(a) for a ∈ A, or apply the entire
structure generally to an element of the parameter structure. For a ∈ A we
construct an operator ↓ such that $(\Sigma_{x \in A} B(x)) \downarrow a$ evaluates to B(a). We can
define ↓ in terms of the image of a relation, Im, so it reduces to

\[
\mathit{Struc} \downarrow a \;=\; (\Sigma_{x \in A} B(x)) \mathbin{\text{\textasciicircum\textasciicircum}} \{a\}
\;=\; \{\, y \mid \exists x \in \{a\}.\ (x, y) \in \Sigma_{x \in A} B(x) \,\}
\]

Then we can use the theorem

\[ (a, b) \in \Sigma_{x \in A} B(x) \;\Longrightarrow\; b \in B(a) \]

to derive

\[ a \in A \;\Longrightarrow\; \mathit{Struc} \downarrow a = B(a) \]

This theorem enables us now to build the instance of a higher order structure 
with an element a of the parameter structure A. 
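
The instantiation operator can be tried out on finite sets. The sketch below is an illustrative model under the assumption that a Σ-structure is simply a Python set of pairs; the names `sigma`, `image` and `instantiate` are hypothetical and not taken from the Isabelle development.

```python
def sigma(A, B):
    """Sigma_{x in A} B(x) as a set of (parameter, image element) pairs."""
    return {(x, y) for x in A for y in B(x)}

def image(rel, xs):
    """Relational image: every y that is related to some x in xs."""
    return {y for (x, y) in rel if x in xs}

def instantiate(struc, a):
    """Struc 'down-arrow' a, defined as the image of the singleton {a}."""
    return image(struc, {a})
```

For A = {2, 3} and B(n) = {0, ..., n-1}, `instantiate(sigma(A, B), 2)` gives back B(2) = {0, 1}. For a parameter outside A, such as 5, the result is the empty set rather than B(5), which is why the derived theorem carries the premise a ∈ A.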
Use of Π Elements of higher order structures which are uniquely defined, like
FactGroup in Sect. 4.2, can be represented by a function definition in the typed
λ-calculus from Sect. 3.1. These λ-functions are elements of the corresponding
Π-set. Let elem = (λx ∈ A. t(x)), then

\[ \mathit{elem} \in \Pi_{x \in A} B(x) \quad\text{iff}\quad \forall a \in A.\ t(a) \in B(a) \]

The function body t of the element elem constructs elements of the image structure
of the higher order structure to which elem belongs. For the application or
instantiation we do not need an extra operator as for Σ. Since we have defined
the Π-type as sets of functions we can use the HOL function application elem(a).
If a ∈ A then this evaluates to t(a).

The proof that the body t(a) for a ∈ A is actually in some image structure
B(a) can be nontrivial. For the examples in Sect. 4.2 we have to show
that the constructed images are groups.

In principle one can define structures that are universally applicable to parameters
directly by $\Pi_{x \in A} B(x)$. For example, we may use Π instead of Σ to
encode the structure of group homomorphisms, because for all groups G and H
there is always a homomorphism between G and H. This idea is discussed elsewhere.
The use of Π as general representation for higher order structures in the
described sense is more complicated than Σ.

4 Application Examples 

In this section we present some examples of abstract algebra and lattice theory 
which we performed in Isabelle to validate the concept introduced in this paper. 
We give outlines of the corresponding definitions of the algebraic structures and 
present the results. We do not display the proofs because they are too long. 

4.1 Definitions 

We start from the definition of groups and homomorphisms as sets of records 
given in Sect. 2.2. The notion of a subgroup uses dependent types. It is a higher 
order structure, because we define subgroups as subsets of a group which are 
themselves groups. This way of definition in terms of the structure Group is 
only possible because structures are first class citizens and can hence be used in 
formulas. 

consts
  subgroup :: (α group-sig × α set) set
defs
  subgroup ≡ Σ_{G∈Group} {H. H ⊆ G.<cr> ∧
      (| H, λx ∈ H. λy ∈ H. G.<f> x y, λx ∈ H. G.<inv> x, G.<e> |) ∈ Group}

From this definition of the structure subgroup we can derive classical theorems
about subgroups. For example, we derive that it is sufficient to show that a
subset H of a group G is closed under the group operations, in order to infer
that H is a subgroup of G (subgroup introduction rule). Using the pretty printing
facilities of Isabelle we define the abbreviation H <<= G to denote that H is
a subgroup of G.

Rings are defined in a similar manner as groups. Assuming the definition of
group homomorphisms Hom from Sect. 2.2, group automorphisms can now be
defined as homomorphisms from one group to the same group such that these
functions are injective on the carrier of the group.

consts
  GroupAuto :: (α group-sig × (α => α)) set
defs
  GroupAuto ≡ Σ_{G∈Group} {Φ | (G, G, Φ) ∈ Hom ∧
      inj-on_{G.<cr>} Φ ∧ Φ(G.<cr>) = G.<cr>}



4.2 Proof Examples 

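Group of Bijections We define the set of bijections Bij and a record BijGroup
consisting of the set of bijections over a set S, the composition of these bijections,
the inverse of a bijection and the identical bijection.

\[ \mathit{Bij}\; S \;\equiv\; \{\, f \mid f \in S \to S \;\land\; f(S) = S \;\land\; \mathit{inj\_on}_S\, f \,\} \]

\[
\mathit{BijGroup}\; S \;\equiv\; (\!|\, \mathit{Bij}\,S,\;
\lambda g \in \mathit{Bij}\,S.\ \lambda f \in \mathit{Bij}\,S.\ g \circ_S f,\;
\lambda f \in \mathit{Bij}\,S.\ \lambda x \in S.\ (\mathit{Inv}_S\, f)\, x,\;
\lambda x \in S.\ x \,|\!)
\]

We can show that this record is in the set Group, i.e. that the bijections together
with the listed operations on them are a group.

\[ \mathit{BijGroup}\; S \in \mathit{Group} \]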
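
For a small finite S the membership BijGroup S ∈ Group can be checked by brute force. The Python sketch below is an illustration under assumptions, not the Isabelle proof: bijections are represented as sorted tuples of (x, f(x)) pairs, and `is_group` tests closure together with the three axioms listed in the definition of Group in Sect. 2. All function names are hypothetical.

```python
from itertools import permutations

def bij(S):
    """All bijections on the finite set S, each as a sorted tuple of (x, f(x)) pairs."""
    S = sorted(S)
    return {tuple(zip(S, p)) for p in permutations(S)}

def compose(g, f):
    """g after f."""
    g_map = dict(g)
    return tuple((x, g_map[y]) for x, y in f)

def inverse(f):
    """Swap every pair and restore the sorted order."""
    return tuple(sorted((y, x) for x, y in f))

def identity(S):
    return tuple((x, x) for x in sorted(S))

def is_group(carrier, op, inv, e):
    """Closure plus the unit, inverse and associativity axioms, checked exhaustively."""
    closed = (e in carrier
              and all(inv(x) in carrier for x in carrier)
              and all(op(x, y) in carrier for x in carrier for y in carrier))
    right_unit = all(op(x, e) == x for x in carrier)
    right_inv = all(op(x, inv(x)) == e for x in carrier)
    assoc = all(op(x, op(y, z)) == op(op(x, y), z)
                for x in carrier for y in carrier for z in carrier)
    return closed and right_unit and right_inv and assoc
```

Calling `is_group(bij({0, 1, 2}), compose, inverse, identity({0, 1, 2}))` checks all six permutations. Such a finite check only illustrates the statement; the Isabelle theorem holds for arbitrary S.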

Group of Ring Automorphisms We use a definition of ring automorphisms
RingAuto similar to group automorphisms (cf. Sect. 4.1) as a higher order structure.
With this we show that the set of ring automorphisms is a subgroup of the
group of bijections over the carrier of the ring. This proof is much simpler than
showing that ring automorphisms are a group. That is, due to the subgroup
introduction rule we derived in Sect. 4.1 it suffices to show closedness of the subset
RingAuto ↓ R to derive

\[ R \in \mathit{Ring} \;\Longrightarrow\; \mathit{RingAuto} \downarrow R \mathrel{\ll\!=} \mathit{BijGroup}(R.\langle cr\rangle) \]

Using the result that BijGroup is indeed a group, by unfolding the definition
of subgroups, we obtain immediately from the former theorem that the
ring automorphisms together with the appropriate operations are a group.

\[
\begin{aligned}
R \in \mathit{Ring} \;\Longrightarrow\; (\!|\, & \mathit{RingAuto} \downarrow R, \\
& \lambda x \in \mathit{RingAuto} \downarrow R.\ \lambda y \in \mathit{RingAuto} \downarrow R.\ (\mathit{BijGroup}(R.\langle cr\rangle).\langle f\rangle)\, x\, y, \\
& \lambda x \in \mathit{RingAuto} \downarrow R.\ (\mathit{BijGroup}(R.\langle cr\rangle).\langle inv\rangle)\, x, \\
& \mathit{BijGroup}(R.\langle cr\rangle).\langle e\rangle \,|\!) \in \mathit{Group}
\end{aligned}
\]

The Isabelle proof code that produces this result is short; the proof is a one line 
command connecting the previously derived results. This theorem illustrates 
nicely how the first class representation of structures allows the reduction of the 
proposition and hence improves the proof process. 
Factorization of a Group We define the factorization of a group by one of
its normal subgroups as

\[
\begin{aligned}
\mathit{FactGroup} \equiv{} & \lambda G \in \mathit{Group}.\ \lambda H \in \{H \mid H \lhd G\}. \\
& (\!|\, \mathit{set\_r\_cos}_G\, H, \\
& \quad (\lambda X \in \mathit{set\_r\_cos}_G\, H.\ \lambda Y \in \mathit{set\_r\_cos}_G\, H.\ \mathit{set\_prod}_G\, X\, Y), \\
& \quad (\lambda X \in \mathit{set\_r\_cos}_G\, H.\ \mathit{set\_inv}_G\, X), \\
& \quad \ldots \,|\!)
\end{aligned}
\]

We use the abbreviation $\mathit{set\_r\_cos}_G\, H$ for the set of all right cosets of H in G.
The terms $\mathit{set\_prod}_G$ and $\mathit{set\_inv}_G$ stand for the lifting of the group operation
and inverse to functions on sets. The notation H ⊲ G abbreviates that H is a
normal subgroup of G. Furthermore, we define the convenient syntax G / H for
the factorization FactGroup G H. With these preparations, we can prove that
this factorization is again a group.

\[ G \in \mathit{Group} \;\land\; H \lhd G \;\Longrightarrow\; G / H \in \mathit{Group} \]

This is equivalent to the structural proposition that the factorization of a group
is a function mapping a group and an element of the set of normal subgroups of
this group to another group.

\[ \mathit{FactGroup} \in (\Pi_{G \in \mathit{Group}}\, \{H \mid H \lhd G\} \to \mathit{Group}) \]



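Direct Product of Groups Similar to the previous example we define the
direct product of two groups as

\[
\begin{aligned}
\mathit{ProdGroup} \equiv{} & \lambda G_1 \in \mathit{Group}.\ \lambda G_2 \in \mathit{Group}. \\
& (\!|\, G_1.\langle cr\rangle \times G_2.\langle cr\rangle, \\
& \quad \lambda (x_1, y_1) \in G_1.\langle cr\rangle \times G_2.\langle cr\rangle.\ \lambda (x_2, y_2) \in G_1.\langle cr\rangle \times G_2.\langle cr\rangle. \\
& \qquad (G_1.\langle f\rangle\, x_1\, x_2,\; G_2.\langle f\rangle\, y_1\, y_2), \\
& \quad \lambda (x, y) \in G_1.\langle cr\rangle \times G_2.\langle cr\rangle.\ (G_1.\langle inv\rangle\, x,\; G_2.\langle inv\rangle\, y), \\
& \quad (G_1.\langle e\rangle,\; G_2.\langle e\rangle) \,|\!)
\end{aligned}
\]

We define the syntax (| G1, G2 |) for this direct product of two groups and derive
that it again forms a group.

\[ G_1 \in \mathit{Group} \;\land\; G_2 \in \mathit{Group} \;\Longrightarrow\; (\!|\, G_1, G_2 \,|\!) \in \mathit{Group} \]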



Full Tarski The fixpoint theorem of A. Tarski [Tar55] is well known in computer
science. Yet the form of the theorem which is usually proved is an older version
from 1928. This theorem says that the least upper bound of all fixpoints P
of a monotonic function f over a complete lattice (A, ⊑) can be obtained as
$\bigvee \{x \in A \mid x \sqsubseteq f(x)\}$. The dual is true for the greatest lower bound $\bigwedge$. Besides
proving that, Tarski showed in the later paper that the set of all fixpoints P
of f is itself a complete lattice. This second result is very well suited to illustrate
the need for a proper structural representation, because it is proved by applying
the first part of the theorem to the interval sublattice $[\bigvee Y, 1]$ for any subset Y
of P. So, our mechanized proof illustrates again the advantages of the present
approach.

The proof of this full version was first formalized by R. Pollack in
LEGO [Pol90]. There, partial equivalence relations have to be used, which makes
the proof quite hard to read.
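
Both parts of the fixpoint theorem can be observed on a small powerset lattice. The following Python sketch is a finite illustration only, assuming the lattice of subsets of {0, 1, 2} ordered by inclusion; the monotone function `step` and all helper names are made up for this example and do not come from the Isabelle formalization.

```python
from itertools import combinations

def powerset(universe):
    """All subsets of a finite collection, as frozensets."""
    items = list(universe)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def join(sets):
    """Least upper bound in a powerset lattice: the union."""
    out = frozenset()
    for s in sets:
        out |= s
    return out

def greatest_fixpoint(f, lattice):
    """Join of all x with x <= f(x), as in the first part of the theorem."""
    return join(x for x in lattice if x <= f(x))

def is_complete_lattice(P):
    """Every subset Y of P has a least upper bound inside P (ordered by inclusion)."""
    for Y in powerset(P):
        uppers = [u for u in P if all(y <= u for y in Y)]
        if not any(all(u <= v for v in uppers) for u in uppers):
            return False
    return True

def step(x):
    """A monotone example function: keep 0 and 1, and add 2 whenever 0 is present."""
    extra = frozenset({2}) if 0 in x else frozenset()
    return (x & frozenset({0, 1})) | extra
```

Here `step` has the four fixpoints ∅, {1}, {0, 2} and {0, 1, 2}. The value of `greatest_fixpoint(step, powerset({0, 1, 2}))` coincides with their join, and `is_complete_lattice` confirms on this instance that the fixpoints form a complete lattice themselves.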

5 Discussion 

The approach to use dependent types as modular structures is a well known one 
(e.g. [Mac86]), but to represent these types as sets is new. The advantage lies in 
the first class property of the structures. As we have seen in Sect. 3.2, we can 
now define operations on modular structures in the logic. Others, like forgetful 
functors, theory interpretations, can be expressed in terms of those. This should 
be further illustrated in future work. 

LEGO [Bur90,LP92], an implementation of the Extended Calculus of Constructions
[Luo90b], uses dependent types as theories [Luo90a]. Our work is similar
to this, but since we construct these types as sets, our approach differs.

One difference is that our dependent structures are terms of the logic, not
types as in ECC. The discussion section of [Luo90a, Sect. 4.4] mentions the
possibility of a combination of two ideas: one is to have dependent structures as
a representation of theories, as done by ECC. The other idea is to have operations
on theories, that is, theories are values and there are operations that can be
performed on those values. These concepts were examined in the specification
language CLEAR [SB83]. Since the dependent types are values of the logic in
our semantical embedding of theory structures, it is possible to define operations
on theories as HOL functions. This has been illustrated in this paper by defining
the operation of instantiation by the structure instance operation (cf. Sect. 3.2).

The other difference is that we are following the LCF-style of not considering
proof objects. Thus, the actual proof construction leading to the results is inde- 
pendent of the type structure of the formalization. Nevertheless, the structures 
we use contain enough information to produce the instances one is interested in, 
as is illustrated by the example proof for the group of ring automorphisms in 
Sect. 4.2. 

Another experiment, worth examining, is to extend Isabelle’s meta logic — 
which is a fragment of higher order logic — to the extent that the structures 
presented in this paper exist generically and not just for the object logic HOL. 

In an earlier version of this work we used products as base type for simple 
structures. Due to a suggestion of P. Martin-Löf at the TYPES 98 workshop we
now employ records, although only in addition to sets. 

Although the syntax definition possibilities in Isabelle are remarkably good,
they can still be improved. For example, terms like G.<f> x y should be expressible
as x ∘ y. This is nontrivial because the reference to the element G is crucial.
Nevertheless, we succeeded in designing a concept of locales for Isabelle [KW98],
realizing this feature. Locales enable definitions depending on local assumptions.
The additional use of locales with the structural concepts presented in this paper
achieves a satisfying style of abstract algebraic reasoning. Difficulties with Σ-types
as theory representation, pointed out in [Polar], are overcome by this
additional feature.
Abstract. Investigating soundness and completeness of verification calculi
for imperative programming languages is a challenging task. Incorrect
results have been published in the past. We take advantage of the
computer-aided proof tool LEGO to interactively establish soundness
and completeness of both Hoare Logic and the operation decomposition
rules of the Vienna Development Method with respect to operational
semantics. We deal with parameterless recursive procedures and local
variables in the context of total correctness.

In this paper, we discuss in detail the role of representations for ex- 
pressions, assertions and verification calculi. To what extent is syntax 
relevant? One needs to carefully select an appropriate level of detail in 
the formalisation in order to achieve one’s objectives. 



1 Introduction 

We have taken advantage of the LEGO system [1] to produce machine-checked 
soundness and completeness proofs for Hoare Logic and the operation decomposition
rules of the Vienna Development Method (VDM). Our imperative programming
language includes (parameterless) recursive procedures and local variables.
We consider static binding and total correctness. This is one of the largest devel- 
opments in LEGO to date. Building on a comprehensive library it additionally 
consists of more than 800 definitions, lemmata and theorems. 

Our message to the designers and researchers of verification calculi is that 
conducting computer-aided soundness and completeness proofs is both a feasi- 
ble and profitable task. Our fundamental contribution has been to highlight the 
role of auxiliary variables in Hoare Logic. Usually, assertions are interpreted as 
predicates on states where free variables denote the value of program variables 
in a specific state. Variables which are unaffected by the program under consid- 
eration then take on the role of auxiliary variables. They are required to relate 
the value of program variables in different states. 

Our view of assertions emphasises the pragmatic importance of auxiliary 
variables. We have followed a proposal by Apt & Meertens to consider assertions 
as relations on states and auxiliary variables [2]. Furthermore, we stipulate a new 
structural rule to adjust auxiliary variables when strengthening preconditions 

* An earlier version appeared as LFCS Technical Report ECS-LFCS-98-393. 
and weakening postconditions. This rule is stronger than all previously suggested
structural rules, including Hoare’s consequence rule [3] and rules of adaptation. 
As a direct consequence of the new treatment of auxiliary variables, 

— we were able to show that Sokolowski’s calculus for recursive procedures [4] 
is sound and complete if one replaces Hoare’s rule of consequence with ours. 
In particular, none of the other structural rules introduced by Apt [5] (which 
lead to a complete but unsound system) are required. 

— We have clarified the relationship between Hoare Logic and its variant VDM. 
We were able to show that, contrary to common belief, VDM is more re- 
strictive than Hoare Logic in that every derivation in VDM can be naturally 
embedded in Hoare Logic. 

Deep versus Shallow Embedding. Traditionally, one defines syntax for ex- 
pressions and relative to this setup, one characterises syntax of a programming 
language and syntax of an assertion language. Then, one describes the meaning 
of every syntactic construct. This approach is known as deep embedding. Alter- 
natively one may shortcut this process and identify the syntactic representation 
with its denotation. This technique is known as shallow embedding. 

Related Work. The pioneering work on machine-checked soundness for Hoare 
Logic by Gordon [6] rests entirely on shallow embedding. Homeier [7] extends the 
soundness proof to a setting with mutually recursive procedures. His encoding is 
based exclusively on deep embedding. Nipkow [8] has been the first to conduct 
a machine-checked completeness proof for Hoare Logic dealing with simple im- 
perative programs in the context of partial correctness. This contains a mixture 
of shallow and deep embedding. Using similar representation techniques we have 
extended this work to recursive procedures and local variables. 

1.1 To What Extent Does Syntax Matter? 

Before deciding on the embedding technique, one ought to clarify the objectives 
of the machine-assisted development. This induces the level of detail in which 
one needs to analyse involved concepts. One of the central issues in formalising 
metatheory is to what extent syntax needs to be formalised. Technically, one has 
a choice of deep versus shallow embedding. 

A shallow embedding cuts down the work load and is therefore, at least for 
machine-checked developments, often the preferred approach. The drawbacks of 
shallow embedding are that 

1. one cannot exploit the inductive (syntactic) structure to prove properties. 

2. The representation of concrete examples is often more difficult to compre- 
hend. 

As the main contribution of this paper, we clarify the role of deep versus 
shallow embedding. In the setting of Hoare Logic, the choice of the level of em- 
bedding has a major influence in the work involved in setting up an appropriate 
theory of substitutions. One needs substitutions on states, expressions and assertions.
With a shallow embedding of expressions and assertions, substitutions
can be expressed in terms of substitutions on the state space.

We have investigated the metatheory of verification calculi. It was not our 
aim to show that a proof tool such as the LEGO system is suitable to verify 
concrete programs. Therefore, the second drawback was of little concern to us. 
Our strategy has been to employ a shallow embedding whenever possible. 

However, one needs to pursue soundness by induction on the structure of
programs whereas completeness is conducted by induction on the derivation of
correctness formulae. Hence, in light of the first drawback of a shallow embedding,
one needs to insist on a deep embedding for programs and the notion of
deriving correctness formulae. The main benefits of employing a shallow embedding
for investigating the metatheory of verification calculi are that

— we did not have to worry at all about substitutions in assertions; an otherwise 
daunting prospect [9,7]. 

— Completeness can only be established for an assertion language which is suf- 
ficiently expressive to denote all intermediate properties such as invariants. 
Employing a deep embedding of assertions, one would need to additionally 
explicitly construct syntactic representations for all possible intermediate 
assertions. 



1.2 Overview 

The outline of this paper is as follows. We first formalise the notion of a state 
space. We then sketch our embedding of expressions, assertions and imperative 
programs. In Sect. 6 we discuss semantics and derivability of Hoare Logic. We
motivate new rules for loops and adjusting auxiliary variables. We argue that 
in investigating soundness and completeness of verification calculi, one should 
gloss over the syntactic details of expressions and assertions. Formalising substi- 
tutions is irrelevant. We will show in Sect. 6.3 that, at least for simple imperative 
programs, in the soundness and completeness proof, one does not need to appeal 
to any property of a substitution function. 

In Sect. 7 we show that the metatheory for verification calculi dealing with 
local variables is more subtle. Not only is it essential to have an adequate sub- 
stitution function (on the level of states), it is also necessary to employ an 
extensional notion of equality. This requires some attention, as type-theoretic 
systems such as Coq and LEGO are tailored to an intensional type theory. The 
case of VDM is similar and not covered in this paper. 



2 The State Space 

The state space records the value of every program variable. Let VAR be the 
type of program variables. In a type-theoretic setting, it seems natural to investi- 
gate multiple sorts. We identify the universe of data types with the universe of all 
types expressible in LEGO. The type of variables can be declared by providing
a function

sort : VAR → Type .

A state σ for a type environment sort is a function mapping program variables x
to values of type sort(x). The state space itself is therefore a dependent function
space:

Definition 1 (State Space). $\Sigma \stackrel{\mathrm{def}}{=} \Pi x : \mathsf{VAR} \cdot \mathsf{sort}(x)$

We have implemented a substitution operation on dependent functions which
satisfies the specification

\[
\sigma[x \mapsto t](y) =
\begin{cases}
t & \text{if } x = y, \\
\sigma(y) & \text{otherwise.}
\end{cases}
\tag{1}
\]

This requires quite sophisticated type theory. See [10] for details.

Alternatively, one could exploit that only a finite number of a priori determined
variables $x_1, \ldots, x_n$ are usually^ employed in any concrete program. Thus,
the dependent type space Σ degenerates into the finite product
$\mathsf{sort}(x_1) \times \cdots \times \mathsf{sort}(x_n)$ [11,12].
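
The specification (1) of state update can be mimicked in an untyped setting by checking sorts at run time. The Python sketch below is only an approximation under that assumption: the type environment `SORT`, the variable names and the helper `make_state` are invented for illustration, and the dynamic check stands in for what the dependent function space guarantees statically in LEGO.

```python
SORT = {"n": int, "done": bool}   # hypothetical type environment, sort : VAR -> Type

def make_state(values):
    """Wrap a dict as a function on the variables declared in SORT."""
    for var, sort in SORT.items():
        if type(values[var]) is not sort:
            raise TypeError(f"{var} must have sort {sort.__name__}")
    return lambda var: values[var]

def update(state, x, t):
    """sigma[x |-> t]: t is returned at x, sigma(y) everywhere else."""
    if type(t) is not SORT[x]:
        raise TypeError(f"{x} must have sort {SORT[x].__name__}")
    return lambda y: t if y == x else state(y)
```

Updating `"n"` to 5 leaves `"done"` and the original state untouched, whereas an attempt to store a boolean in `"n"` is rejected, mirroring the requirement that t has type sort(x).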

3 Expressions 

Boolean expressions occur in loops and conditional statements. Other types of 
expressions depend on the data types expressible in the language and occur both 
as subexpressions of boolean expressions and in the assignment statement. One 
may define the syntax of expressions by a BNF grammar. 

Example 1 (Syntax of Expression). Homeier & Martin [13] define two classes of
expressions

\[
\begin{aligned}
e &::= n \mid x \mid {+}{+}x \mid e_1 + e_2 \mid e_1 - e_2 \\
b &::= e_1 = e_2 \mid e_1 < e_2 \mid b_1 \land b_2 \mid b_1 \lor b_2 \mid \lnot b
\end{aligned}
\]

We will only consider expressions without side-effects^ and do not deal with
the expression ++x. The semantics can thus be easily fixed denotationally and
is determined by an interpretation function I and a state σ. An interpretation
determines the value of constants such as 0, +, ∧ and (free) variables in expressions
and logical formulae e.g., $[\![x]\!](I)(\sigma) = I(\sigma(x))$. Whenever we come across a
boolean expression in a loop or a conditional statement, we are only interested 
in the value it evaluates to, true or false. Similarly, in an assignment, we treat 
evaluation of the expression as atomic, merely a value depending on the state 

^ An exception is e.g., Lisp. 

^ Such a strict distinction between expressions and commands is one of the fundamen- 
tal principles underlying idealised Algol [14]. 
space. We are not interested in syntactic properties such as whether one expression
is a subterm of another expression. Ignoring the syntax of expressions paves
the way towards a reasonable level of abstraction when investigating properties
of verification calculi for imperative programs without side-effects in expressions.

Furthermore, we are only interested in the standard interpretation^ of con- 
stants. Hence, the state space alone determines the semantics. We only consider 
expressions at this semantic level: 

Definition 2 (Expressions, Shallow Embedding). Given an arbitrary
type T, we represent expressions by

\[ \mathsf{expression}(T : \mathsf{Type}) \stackrel{\mathrm{def}}{=} \Sigma \to T . \]

Let e : expression(T) be any expression. Its evaluation depends on a concrete
snapshot of the state space σ : Σ. We define $\mathsf{eval}(\sigma)(e) \stackrel{\mathrm{def}}{=} e(\sigma)$.

A benefit of adopting shallow embedding is that we do not have to worry
about formalising the syntax in a logical framework. Working on the metatheory,
one never encounters a concrete expression! Moreover, substitutions are much
easier to deal with at the semantic level. They can be defined in terms of updating
states:

Definition 3 (Updating Expressions, Shallow Embedding).

\[ e[x \mapsto t](\sigma) \stackrel{\mathrm{def}}{=} e(\sigma[x \mapsto t]) \]

In a deep embedding, one would need to define an interpretation and substitution
function by induction on the structure of expressions and prove the substitution
lemma

\[ [\![e[x \mapsto t]]\!](\sigma) = [\![e]\!](\sigma[x \mapsto t]) . \]

An advantage of deep embedding is that, for concrete expressions, substitutions
are more palatable. Consider the syntactic substitution $(x * y)[x \mapsto 3]$. Due to
the recursive definition of updating, this should reduce to $3 * y$. In the shallow
embedding, we would instead have

\[
\lambda\sigma \cdot \Bigl( \underbrace{(\lambda\sigma \cdot \sigma(x) * \sigma(y))}_{[\![x * y]\!]} \; (\sigma[x \mapsto 3]) \Bigr)
\tag{2}
\]

which (β-)reduces to $\lambda\sigma \cdot (\sigma[x \mapsto 3](x) * \sigma[x \mapsto 3](y))$. This is equivalent to

\[ \lambda\sigma \cdot 3 * \sigma(y) \tag{3} \]

Unfortunately, the LEGO system offers little support for reducing (2) to (3). In
concrete examples, this leads to excessively large proof obligations. Computer-aided
verification becomes unfeasible^.
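
The contrast between the two embeddings can be made tangible with a few lines of Python. This is a sketch under simplifying assumptions, not the LEGO encoding: states are functions from variable names to integers, the deep embedding only knows constants, variables and multiplication, and the substituted term t is restricted to a constant value. All names are hypothetical.

```python
def update(state, x, t):
    """sigma[x |-> t] on states modelled as functions from names to values."""
    return lambda y: t if y == x else state(y)

# Shallow embedding: an expression is its denotation, a function on states.
def subst_shallow(e, x, t):
    """Definition 3: e[x |-> t](sigma) = e(sigma[x |-> t])."""
    return lambda state: e(update(state, x, t))

# Deep embedding: expressions are syntax trees built from nested tuples.
def interp(e, state):
    tag = e[0]
    if tag == "const":
        return e[1]
    if tag == "var":
        return state(e[1])
    if tag == "mul":
        return interp(e[1], state) * interp(e[2], state)
    raise ValueError(f"unknown expression {e!r}")

def subst_deep(e, x, t):
    """Syntactic substitution of the constant t for the variable x."""
    tag = e[0]
    if tag == "var":
        return ("const", t) if e[1] == x else e
    if tag == "mul":
        return ("mul", subst_deep(e[1], x, t), subst_deep(e[2], x, t))
    return e  # constants are unchanged
```

For the term x * y, `subst_deep` returns the readable tree for 3 * y, while `subst_shallow` returns an opaque closure that merely computes the same value. Evaluating both in a state confirms the substitution lemma on this instance.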

^ This not only simplifies the encoding, it also avoids the problematic issue of how
to axiomatise the class of acceptable interpretations. In particular, incompleteness
results of Hoare Logic e.g., [15], exploit setups with non-standard interpretations.

^ In a verification of the recursive algorithm Quicksort we had not manually intervened
in reducing substitutions. For the correctness proof, LEGO had to run for
4 Assertions

Traditionally, assertions are considered to be simply formulae of first-order logic, 
which are interpreted in the usual way, except that the value of variables is de- 
termined by a state. Semantically, from a type-theoretic point of view, assertions 
are the particular class of expressions over propositions i.e., expression(Prop). 
Instead of first-order logic it is convenient to exploit the native logic of the 
theorem prover. This encoding has been adopted in [6,8]. 

Our novel approach to Hoare Logic has been to give a more rigorous treat- 
ment of auxiliary variables. They are required at the level of specifications to 
relate the value of variables in different states as assertions may otherwise only 
relate the value of program variables in a single state. 

At the syntactic level, one would need to (formally) distinguish between 
program variables and auxiliary variables. One could for example enforce that 
program variables have to start with a lower-case letter, whereas auxiliary vari- 
ables must start with an upper-case letter. To be well-formed, programs may 
only refer to program variables. 

Semantically, program variables are, as before, interpreted according to the 
state space. However, auxiliary variables are interpreted freely. Let T be the 
domain of this interpretation. 

Definition 4 (Assertions — Shallow Embedding). 

$\mathrm{Assertion}(T : \mathrm{Type}) = (T \times \Sigma) \to \mathrm{Prop}$

Example 2. Let $T = \{X, Y\} \to \mathrm{int}$. Relative to an interpretation $Z : T$ and a
state $\sigma$, we interpret $[\![0 < y \wedge x = X \wedge y = Y]\!](Z, \sigma) = 0 < \sigma(y) \wedge \sigma(x) = Z(X) \wedge
\sigma(y) = Z(Y)$.

Due to the shallow embedding we may update assertions analogously to expressions
by delegating the work to updating the state space. In practice we only need
to update the value of program variables but not auxiliary variables. Let p be
an assertion.

Definition 5 (Updating Assertions — Shallow Embedding). 

$p[x \mapsto t](Z, \sigma) = p(Z, \sigma[x \mapsto t])$ .
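A minimal sketch of Definitions 4 and 5, assuming dictionaries for both the state and the interpretation of auxiliary variables; the names `example2` and `update_assertion` are hypothetical and chosen for illustration only.

```python
# Assumption: states and auxiliary interpretations are dicts of integers.

def update(sigma, x, v):
    """Return the state sigma[x |-> v] as a fresh dict."""
    new = dict(sigma)
    new[x] = v
    return new

def example2(Z, sigma):
    """The assertion 0 < y and x = X and y = Y of Example 2."""
    return 0 < sigma["y"] and sigma["x"] == Z["X"] and sigma["y"] == Z["Y"]

def update_assertion(p, x, t):
    """Definition 5: p[x |-> t](Z, sigma) = p(Z, sigma[x |-> t])."""
    return lambda Z, sigma: p(Z, update(sigma, x, t))

Z = {"X": 4, "Y": 2}
sigma = {"x": 0, "y": 2}

print(example2(Z, sigma))                            # x differs from X
print(update_assertion(example2, "x", 4)(Z, sigma))  # holds after x |-> 4
```

Only the state component is updated; the auxiliary interpretation Z is passed through unchanged, as the text requires.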

Analogously to expressions, in a deep embedding, one would need to additionally
represent syntax for assertions, define an interpretation and a syntactic substitution
function. Then, one would need to prove the substitution lemma

$[\![p[x \mapsto t]]\!](Z, \sigma) = [\![p]\!](Z, \sigma[x \mapsto t])$ .

more than 37 hours requiring more than 80MB on a SUN SPARC station 20 with 
sufficient physical memory to avoid swapping. In comparison, on the same architec- 
ture, the completeness proof for Hoare Logic dealing with recursive procedures and 
local variables could be dealt with in less than 15 minutes requiring less than 25MB. 
In both cases, we started LEGO in the empty environment. 
5 Imperative Programs



A shallow embedding of non-trivial imperative programs is problematic for
a strongly-typed proof system such as LEGO. An imperative program S may
not terminate. Hence, the denotation of S is a partial function on the state
space, $[\![S]\!] : \Sigma \rightharpoonup \Sigma$. However, LEGO only supports total functions. To avoid
partiality, one may move to relational denotational semantics $[\![S]\!] : (\Sigma \times \Sigma)
\to \mathrm{bool}$, see [6] for an example.

In any case, to formally prove soundness within a logical framework, one 
needs to pursue induction on the structure of programs. Thus, one has to select 
a deep embedding strategy for the imperative programming language. For the 
purpose of this section, we consider a (very) simple imperative programming 
language consisting of assignments and loops. 

Definition 6 (Syntax of Imperative Programs: Deep Embedding). Imperative
programs S : prog are defined by the BNF grammar $S ::= x := e \mid
\mathbf{while}\ b\ \mathbf{do}\ S$ where $x : \mathrm{VAR}$, $e : \mathrm{expression}(\mathrm{sort}(x))$ and $b : \mathrm{expression}(\mathrm{bool})$.

We employ structural operational semantics which provides a clean way to 
specify the effect of each language constructor in an arbitrary state. It relates a 
program with its initial and final state. 

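Definition 7 (Structural Operational Semantics). The operational semantics
is defined as the least relation $\cdot \overset{\cdot}{\blacktriangleright} \cdot \subseteq \Sigma \times \mathrm{prog} \times \Sigma$ satisfying

$\sigma \overset{x := e}{\blacktriangleright} \sigma[x \mapsto \mathrm{eval}(\sigma)(e)]$    (4)

$\sigma \overset{\mathbf{while}\ b\ \mathbf{do}\ S}{\blacktriangleright} \sigma$    provided $\mathrm{eval}(\sigma)(b) = \mathrm{false}$ .

$\dfrac{\sigma \overset{S}{\blacktriangleright} \eta \qquad \eta \overset{\mathbf{while}\ b\ \mathbf{do}\ S}{\blacktriangleright} \tau}{\sigma \overset{\mathbf{while}\ b\ \mathbf{do}\ S}{\blacktriangleright} \tau}$    provided $\mathrm{eval}(\sigma)(b) = \mathrm{true}$ .

Intuitively, $\sigma \overset{S}{\blacktriangleright} \tau$ denotes that the program S when invoked in the
state $\sigma$ will terminate in the state $\tau$.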
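The rules of Definition 7 can be read as an interpreter. The sketch below is an assumption-laden illustration, not the LEGO formalisation: programs are tuples, expressions are Python functions on dictionary states, and a hypothetical `fuel` bound stands in for non-termination, which the relational semantics expresses by the absence of a final state.

```python
def update(sigma, x, v):
    """Return sigma[x |-> v] as a fresh dict."""
    new = dict(sigma)
    new[x] = v
    return new

def run(prog, sigma, fuel=1000):
    """Return the final state tau, or None once the fuel is used up."""
    if prog[0] == "assign":              # axiom (4)
        _, x, e = prog
        return update(sigma, x, e(sigma))
    if prog[0] == "while":
        _, b, body = prog
        while b(sigma):                  # rule for eval(sigma)(b) = true
            if fuel == 0:
                return None              # give up: possibly non-terminating
            fuel -= 1
            sigma = run(body, sigma, fuel)
            if sigma is None:
                return None
        return sigma                     # rule for eval(sigma)(b) = false
    raise ValueError("unknown program constructor")

countdown = ("while", lambda s: s["x"] > 0,
             ("assign", "x", lambda s: s["x"] - 1))
forever = ("while", lambda s: True,
           ("assign", "x", lambda s: s["x"]))

print(run(countdown, {"x": 3}))
print(run(forever, {"x": 3}))
```

The loop is unrolled iteratively rather than by recursion on the second premise of the rule; for this deterministic language both readings yield the same final state.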



6 Semantics and Derivability of Hoare Logic 

Hoare Logic is a verification calculus for deriving correctness formulae of the
form {p} S {q} for assertions p, q and programs S. We consider total correctness.
Intuitively, {p} S {q} specifies that, provided S is executed in a state such
that the precondition p holds, it terminates in a state $\tau$ where the postcondition
q is satisfied. One distinguishes between the semantics of a correctness
formula $\models_{\mathrm{Hoare}}$ {p} S {q} (which formalises the above intuition) and the notion
of deriving a correctness formula $\vdash_{\mathrm{Hoare}}$ {p} S {q} (which is employed in
order to verify concrete programs) [16].
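Definition 8 (Semantics of Hoare Logic). Parametrised by an arbitrary
type T, let $\models_{\mathrm{Hoare}} \{\cdot\}\,\cdot\,\{\cdot\} \subseteq \mathrm{Assertion}(T) \times \mathrm{prog} \times \mathrm{Assertion}(T)$ be a new
judgement defined in terms of the operational semantics

$\models_{\mathrm{Hoare}} \{p\}\ S\ \{q\} = \forall Z \cdot \forall\sigma \cdot p(Z, \sigma) \Rightarrow \exists\tau \cdot \sigma \overset{S}{\blacktriangleright} \tau \wedge q(Z, \tau)$ .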
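Definition 8 can be tested, though of course not proved, by enumeration. The sketch below assumes a finite sample of states and auxiliary values and a fuel-bounded interpreter; `valid`, `run` and the example program are hypothetical names used only for illustration.

```python
from itertools import product

def update(sigma, x, v):
    new = dict(sigma)
    new[x] = v
    return new

def run(prog, sigma, fuel=1000):
    """Fuel-bounded interpreter for assignments and loops."""
    if prog[0] == "assign":
        _, x, e = prog
        return update(sigma, x, e(sigma))
    _, b, body = prog                    # otherwise a while loop
    while b(sigma):
        if fuel == 0:
            return None                  # treated as non-termination
        fuel -= 1
        sigma = run(body, sigma, fuel)
        if sigma is None:
            return None
    return sigma

def valid(p, prog, q, aux_values, states):
    """Bounded check of |= {p} prog {q} (total correctness)."""
    for Z, sigma in product(aux_values, states):
        if p(Z, sigma):
            tau = run(prog, sigma)
            if tau is None or not q(Z, tau):
                return False
    return True

down = ("while", lambda s: s["x"] != 0,
        ("assign", "x", lambda s: s["x"] - 1))
states = [{"x": n} for n in range(-3, 4)]
aux = [0]                                # a single dummy auxiliary value

ok = valid(lambda Z, s: s["x"] >= 0, down, lambda Z, t: t["x"] == 0, aux, states)
bad = valid(lambda Z, s: True, down, lambda Z, t: t["x"] == 0, aux, states)
print(ok, bad)
```

The second check fails because the loop does not terminate for negative x, which total correctness rules out while partial correctness would not.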

Based on work of Floyd [17], Hoare [3] proposed a verification calculus for partial 
correctness, now referred to as Hoare Logic. For every constructor of the imper- 
ative programming language, Hoare Logic provides a rule which allows one to 
decompose a program. The precondition of the assignment axiom 

$\{p[x \mapsto e]\}\ x := e\ \{p\}$

is, at least for simple imperative programs, the sole reason for having to bother 
about updating assertions! 

Programs mentioned in the premisses are strict subprograms of the programs 
mentioned in the conclusions. Unlike the operational semantics, this also holds 
for loops. 



$\dfrac{\{p \wedge b\}\ S\ \{p\}}{\{p\}\ \mathbf{while}\ b\ \mathbf{do}\ S\ \{p \wedge \neg b\}}$    (5)



One also needs a structural rule of consequence. Read from conclusion to premise,
it weakens the precondition and strengthens the postcondition; its side-condition
is a proof obligation. This is particularly useful when one wants to
apply the rule for loops as the precondition must remain invariant with respect
to the body of the loop.

$\dfrac{\{p_1\}\ S\ \{q_1\}}{\{p\}\ S\ \{q\}}$    provided $p \Rightarrow p_1$ and $q_1 \Rightarrow q$.    (6)



6.1 Total Correctness 

To ensure termination, the rule for loops (5) needs to be modified. We introduce
a termination measure $u : \mathrm{expression}(W)$ for some well-founded structure $(W, <)$
which is decreased whenever the body is executed:

$\dfrac{\forall t : W \cdot \{p \wedge b \wedge u = t\}\ S\ \{p \wedge u < t\}}{\{p\}\ \mathbf{while}\ b\ \mathbf{do}\ S\ \{p \wedge \neg b\}}$

A similar rule for verification calculi where postconditions may explicitly refer 
to the value of program variables in the initial state e.g., VDM, has been put 
forward by Manna & Pnueli [18]. Variants of this rule tailored for W = nat [19] 
or W = int [20] have also been published previously. We prefer the well-founded 
version, because it simplifies the completeness proof without any impact on the 
soundness proof [10]. It is well known that in practice, it is often easier to reason 
about termination using well-founded sets rather than being restricted to natural 
numbers [21]. 
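As an illustration, consider a countdown loop. The instance below is a hypothetical example, not one taken from the paper: it assumes integer variables, the invariant $x \ge 0$, the measure $u \equiv x$, and the well-founded order $a \prec c$ iff $0 \le a < c$ on the integers.

```latex
% Assumed instance: p \equiv x \ge 0, b \equiv x > 0, u \equiv x,
% W = \mathbb{Z} ordered by a \prec c iff 0 \le a < c.
\[
\frac{\forall t : \mathbb{Z} \cdot
      \{x \ge 0 \wedge x > 0 \wedge x = t\}\ x := x - 1\ \{x \ge 0 \wedge x \prec t\}}
     {\{x \ge 0\}\ \mathbf{while}\ x > 0\ \mathbf{do}\ x := x - 1\
      \{x \ge 0 \wedge \neg(x > 0)\}}
\]
% The premise follows from the assignment axiom and the rule of consequence:
% x > 0 \wedge x = t implies x - 1 \ge 0 and 0 \le x - 1 < t.
```

The order $\prec$ is well-founded because every element below a given one is a non-negative integer, so any descending chain is finite.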



Metatheory of Verification Calculi in LEGO 






6.2 Auxiliary Variables 

Furthermore, we have strengthened the rule of consequence (6) so that one may
adjust auxiliary variables when strengthening preconditions and weakening postconditions.
Let x be the list of all program variables and Z the list of all auxiliary
variables occurring in the assertions $p_1$, $q_1$, p and q. We propose the new rule

$\dfrac{\{p_1\}\ S\ \{q_1\}}{\{p\}\ S\ \{q\}}$

provided $\forall Z \cdot \forall x \cdot p \Rightarrow \exists Z_1 \cdot (p_1[Z \mapsto Z_1]) \wedge (\forall x \cdot (q_1[Z \mapsto Z_1]) \Rightarrow q)$



Example 3 (Auxiliary Variables). With this rule (but not Hoare’s (6)), the two
correctness formulae $\{X = x\}\ S\ \{X = x\}$ and $\{X = x + 1\}\ S\ \{X = x + 1\}$,
where all variables denote integer values and X is an auxiliary variable, are
interderivable.
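To see why, one may spell out the side-condition for one direction. The derivation below is a sketch under the assumption that the strengthened rule is read with $p_1 \equiv q_1 \equiv (X = x)$ and $p \equiv q \equiv (X = x + 1)$; the choice of witness is ours.

```latex
% Deriving {X = x + 1} S {X = x + 1} from {X = x} S {X = x},
% with the (assumed) witness X_1 := X - 1.
\[
\forall X \cdot \forall x \cdot\; X = x + 1 \;\Rightarrow\;
  \exists X_1 \cdot (X_1 = x) \wedge (\forall x \cdot X_1 = x \Rightarrow X = x + 1)
\]
% With X_1 := X - 1 the first conjunct reads X - 1 = x, which is the hypothesis.
% The second reads X - 1 = x \Rightarrow X = x + 1 for every final value x,
% which holds by arithmetic. The converse direction uses the witness X_1 := X + 1.
```

Rule (6) offers no such witness: it would require $X = x + 1 \Rightarrow X = x$, which is false.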

The new rule of consequence plays a crucial role in deriving the Most General 
Formula (MGF), the key theorem to establish completeness for Hoare Logic 
dealing with recursive procedures [22,10,23]. 

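Definition 9 (Derivability of Hoare Logic: Deep Embedding). A verification
calculus for Hoare Logic is defined as the least relation

$\vdash_{\mathrm{Hoare}} \{\cdot\}\,\cdot\,\{\cdot\} \subseteq \mathrm{Assertion}(T) \times \mathrm{prog} \times \mathrm{Assertion}(T)$

indexed by an arbitrary type T such that

$\vdash_{\mathrm{Hoare}} \{\lambda(Z, \sigma) \cdot p(Z, \sigma[x \mapsto \mathrm{eval}(\sigma)(e)])\}\ x := e\ \{p\}$    (7)

$\dfrac{\forall t : W \cdot \vdash_{\mathrm{Hoare}} \{\lambda(Z, \sigma) \cdot p(Z, \sigma) \wedge \mathrm{eval}(\sigma)(b) = \mathrm{true} \wedge \mathrm{eval}(\sigma)(u) = t\}\ S\ \{\lambda(Z, \tau) \cdot p(Z, \tau) \wedge \mathrm{eval}(\tau)(u) < t\}}{\vdash_{\mathrm{Hoare}} \{p\}\ \mathbf{while}\ b\ \mathbf{do}\ S\ \{\lambda(Z, \tau) \cdot p(Z, \tau) \wedge \mathrm{eval}(\tau)(b) = \mathrm{false}\}}$

where $(W, <)$ is well-founded.

$\dfrac{\vdash_{\mathrm{Hoare}} \{p_1\}\ S\ \{q_1\}}{\vdash_{\mathrm{Hoare}} \{p\}\ S\ \{q\}}$

provided $\forall Z \cdot \forall\sigma \cdot p(Z, \sigma) \Rightarrow \exists Z_1 \cdot \big(p_1(Z_1, \sigma) \wedge (\forall\tau \cdot q_1(Z_1, \tau) \Rightarrow q(Z, \tau))\big)$ .    (8)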
6.3 Soundness

Formally, one needs to show that whenever a correctness formula $\vdash_{\mathrm{Hoare}}$
{p} S {q} is derivable, the proposition $\models_{\mathrm{Hoare}}$ {p} S {q} holds. Soundness is
best pursued by induction on the derivation of the correctness formula. For the
discussion of deep versus shallow embedding, the case of an assignment is of
particular interest.

Lemma 1 (Soundness of Assignment Axiom).

$\models_{\mathrm{Hoare}} \{\lambda(Z, \sigma) \cdot p(Z, \sigma[x \mapsto \mathrm{eval}(\sigma)(e)])\}\ x := e\ \{p\}$

Proof: Expanding the definition of $\models_{\mathrm{Hoare}}$, given Z, $\sigma$, one needs to establish

$p(Z, \sigma[x \mapsto \mathrm{eval}(\sigma)(e)]) \Rightarrow \exists\tau \cdot \sigma \overset{x := e}{\blacktriangleright} \tau \wedge p(Z, \tau)$ .

The operational semantics uniquely determines the final state $\tau$. Appealing to
the axiom (4), it suffices to show

$p(Z, \sigma[x \mapsto \mathrm{eval}(\sigma)(e)]) \Rightarrow p(Z, \sigma[x \mapsto \mathrm{eval}(\sigma)(e)])$ .

As expected, due to a shallow embedding, we only have one notion of substitution
(on the level of states). But perhaps surprisingly, soundness holds regardless of
the details of the actual substitution function⁵.
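The independence from the substitution function can be observed experimentally. In the sketch below, which is an illustration under our own assumptions and not part of the LEGO proof, both the operational step and the axiom's precondition are parametrised by an update function `upd`; `odd` is a deliberately non-standard update invented for the test.

```python
from itertools import product

def assign(upd, x, e):
    """Operational step of x := e, parametrised by the update function."""
    return lambda sigma: upd(sigma, x, e(sigma))

def axiom_pre(upd, p, x, e):
    """Precondition of axiom (7): lambda (Z, sigma). p(Z, upd(sigma, x, eval))."""
    return lambda Z, sigma: p(Z, upd(sigma, x, e(sigma)))

def standard(sigma, x, v):
    new = dict(sigma)
    new[x] = v
    return new

def odd(sigma, x, v):
    """A non-standard 'update' that stores v + 1 instead of v."""
    new = dict(sigma)
    new[x] = v + 1
    return new

def sound(upd, p, x, e, aux_values, states):
    """Bounded check that the assignment axiom is valid for this upd."""
    pre, step = axiom_pre(upd, p, x, e), assign(upd, x, e)
    return all(p(Z, step(s))
               for Z, s in product(aux_values, states) if pre(Z, s))

p = lambda Z, s: s["x"] == Z
e = lambda s: s["y"] + 1
states = [{"x": a, "y": b} for a in range(3) for b in range(3)]

print(sound(standard, p, "x", e, range(5), states),
      sound(odd, p, "x", e, range(5), states))
```

The check succeeds for both functions because semantics and axiom share the same `upd`, mirroring the last step of the proof where both sides of the implication coincide.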

If one is only interested in establishing soundness (and not completeness), 
there is no need for any deep embeddings. Induction on the structure of programs 
is not required. Hence, there is no need for a deep embedding of imperative 
programs e.g., Gordon [6] represents programs by their relational denotational 
semantics. 



A Shallow Embedding of Hoare Logic. Moreover, if one externalises the
induction of the soundness proof to the meta-level as opposed to the proof tool,
one can give a shallow embedding for Hoare Logic. Without a notion of derivability
as given in Definition 9, soundness can be established by showing that axioms
are valid with respect to $\models_{\mathrm{Hoare}}$ and that all rules preserve soundness. This approach
has been pursued by Gordon [6], Homeier [7], Homeier & Martin [13]
and Norrish [24].

One must however be clear about the limitations of this approach. For example,
Homeier & Martin [13] erroneously claim that the soundness of a (complete)
Verification Condition Generator (VCG) has been established by appealing to
the axioms and rules of an (incomplete) presentation of Hoare Logic⁶. But since
they employ a shallow embedding of Hoare Logic, correctness of the VCG has
instead been established by appealing to the definition of operational semantics.

⁵ This observation has already been reported in [8].

⁶ A consequence rule is missing. Thus, one cannot derive, e.g., {x = 1} skip {true}.






6.4 Completeness 

In an incomplete formal system, one may only verify a strict subset of all true 
formulae. A naive definition of completeness is bound to fail in the context of 
verification calculi. On the one hand, if the chosen underlying logical language 
is too weak, e.g., pure first-order logic together with the boolean constants false 
and true, some intermediate assertions cannot be expressed. Hence, derivations 
cannot be completed. On the other hand, if the logical language is too strong, 
e.g. Peano Arithmetic, it itself is already incomplete and the verification calculus 
inherits incompleteness. 

To avoid this problem, Cook has proposed that one investigates relative completeness
in an attempt to separate the reasoning about programs from the reasoning
about the underlying logical language [25]. One only considers expressive
first-order logics. Furthermore, rules of the verification calculus may be applied in
a derivation if the logical side-condition is valid rather than derivable. In particular,
completeness no longer compares a proof-theoretic with a model-theoretic
account.

In practice, achieving relative completeness of verification calculi is highly
desirable. In logic, finding valid formulae which cannot be derived is often
somewhat esoteric. A different story has to be told for the notion of relative
completeness in verification calculi; e.g., in Sokolowski’s calculus [4], it is very
difficult to come up with any non-contrived correctness formula of a recursive
procedure which can be derived!

In a machine-checked development, it is convenient to interpret Cook’s proposal
by employing the native (expressive) logic of the theorem prover to interpret
assertions. A shallow embedding of assertions automatically blurs the model-theoretic
and proof-theoretic aspects of assertions. As an important step in the completeness
proof, one needs to be able to formulate an assertion which expresses the
weakest precondition relative to an arbitrary program and postcondition. With
a shallow embedding, this is straightforward:

Definition 10 (Weakest Precondition: Shallow Embedding).

$\mathrm{wp}(S, q)(Z, \sigma) = \exists\tau \cdot \sigma \overset{S}{\blacktriangleright} \tau \wedge q(Z, \tau)$ .

With a deep embedding of assertions, one would have to derive a syntactic representation
which denotes the weakest precondition. This is considerably more
challenging⁷.
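In a shallow setting the weakest precondition is literally a function built from the semantics. The following sketch assumes a deterministic, fuel-bounded interpreter (so the existential quantifier collapses to a single run); `wp`, `run` and the sample program are hypothetical names.

```python
def update(sigma, x, v):
    new = dict(sigma)
    new[x] = v
    return new

def run(prog, sigma, fuel=1000):
    """Fuel-bounded interpreter for assignments and loops."""
    if prog[0] == "assign":
        _, x, e = prog
        return update(sigma, x, e(sigma))
    _, b, body = prog                    # otherwise a while loop
    while b(sigma):
        if fuel == 0:
            return None                  # treated as non-termination
        fuel -= 1
        sigma = run(body, sigma, fuel)
        if sigma is None:
            return None
    return sigma

def wp(prog, q):
    """Definition 10 for a deterministic language: the only candidate
    for tau is the state returned by run, if there is one."""
    def pre(Z, sigma):
        tau = run(prog, sigma)
        return tau is not None and q(Z, tau)
    return pre

down = ("while", lambda s: s["x"] != 0,
        ("assign", "x", lambda s: s["x"] - 1))
post = lambda Z, tau: tau["x"] == 0

print(wp(down, post)(None, {"x": 2}), wp(down, post)(None, {"x": -1}))
```

No syntax for the resulting assertion is ever constructed, which is exactly what a deep embedding would have to provide.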

One may prove completeness directly by induction on the structure of S.
Instead, we follow a technique developed by Gorelick, which, previously, has
only been applied to the scenario of Hoare Logic dealing with recursive procedures [27]:

1. By induction on the structure of an arbitrary program S, one establishes that
a specific correctness formula $\mathrm{MGF}_{\mathrm{Hoare}}(S)$ is derivable in the verification
calculus.

2. Given the assumption $\models_{\mathrm{Hoare}}$ {p} S {q}, one may derive $\vdash_{\mathrm{Hoare}}$ {p} S {q} by
applying structural rules to $\vdash_{\mathrm{Hoare}} \mathrm{MGF}_{\mathrm{Hoare}}(S)$. All side-conditions which
arise will be dealt with by the assumption.

⁷ If the assertion language is Peano Arithmetic, this construction is not for the faint-hearted
as one has to work on the level of Gödel numbers [26].

In other words, instead of directly deriving $\models_{\mathrm{Hoare}} \{p\}\ S\ \{q\} \Rightarrow
\vdash_{\mathrm{Hoare}} \{p\}\ S\ \{q\}$, one considers the stronger property $\vdash_{\mathrm{Hoare}} \mathrm{MGF}_{\mathrm{Hoare}}(S)$ for
which induction goes through more easily. In particular, the direct proof cannot
be applied when one considers recursive procedures, because the induction
hypotheses are not strong enough.

The proposition $\vdash_{\mathrm{Hoare}} \mathrm{MGF}_{\mathrm{Hoare}}(S)$ asserts that, provided that one only
considers input states in which the program S terminates, one may derive a
correctness formula in which the postcondition relates all inputs with the appropriate
outputs according to the underlying operational semantics of the programming
language. At the semantic level, $\models_{\mathrm{Hoare}} \mathrm{MGF}_{\mathrm{Hoare}}(S)$ holds trivially.

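Definition 11 (MGF: Shallow Embedding).

$\mathrm{MGF}_{\mathrm{Hoare}}(S) = \{\lambda(Z, \sigma) \cdot \sigma \overset{S}{\blacktriangleright} Z\}\ S\ \{\lambda(Z, \tau) \cdot Z = \tau\}$

Notice that the precondition is equivalent to the weakest precondition relative
to the postcondition $\lambda(Z, \tau) \cdot Z = \tau$.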
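The equivalence can be checked in one line by unfolding Definition 10 with auxiliary type $T = \Sigma$; the derivation below is a sketch using only the definitions above.

```latex
\[
\mathrm{wp}(S, \lambda(Z, \tau) \cdot Z = \tau)(Z, \sigma)
  \;=\; \exists\tau \cdot \sigma \overset{S}{\blacktriangleright} \tau \wedge Z = \tau
  \;\Leftrightarrow\; \sigma \overset{S}{\blacktriangleright} Z
\]
% The last step eliminates the existential by substituting the witness \tau := Z.
```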

Analogously to the proof of soundness, in deriving the MGF for assignment, one
again encounters the phenomenon that the details of the substitution function
are irrelevant.

7 Extensional Equality and Local Variables 

In the previous section, we have seen that, for soundness (and completeness),
details of substitutions can be neglected. Catering for local initialised variables
$\mathbf{new}\ x := e\ \mathbf{in}\ S$ is however more demanding, because one needs to reinstate the
previous value of x after the block. Based on an idea by Sieber [28], Olderog [29]
captures the semantics of blocks by

$\dfrac{\sigma[x \mapsto \mathrm{eval}(\sigma)(e)] \overset{S}{\blacktriangleright} \tau}{\sigma \overset{\mathbf{new}\ x := e\ \mathbf{in}\ S}{\blacktriangleright} \tau[x \mapsto \sigma(x)]}$
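Read operationally, the rule runs the body in a state where x is initialised and then restores the outer value of x. The sketch below assumes dictionary states and tuple-encoded programs, and omits loops for brevity; the constructor tag `"new"` and the function `run` are hypothetical names.

```python
def update(sigma, x, v):
    """Return sigma[x |-> v] as a fresh dict."""
    new = dict(sigma)
    new[x] = v
    return new

def run(prog, sigma):
    """Interpreter for assignments and initialised blocks (no loops)."""
    tag = prog[0]
    if tag == "assign":
        _, x, e = prog
        return update(sigma, x, e(sigma))
    if tag == "new":
        _, x, e, body = prog
        # run the body with x initialised to eval(sigma)(e) ...
        tau = run(body, update(sigma, x, e(sigma)))
        # ... and reinstate the previous value of x afterwards
        return update(tau, x, sigma[x])
    raise ValueError("unknown program constructor")

block = ("new", "x", lambda s: 5,
         ("assign", "y", lambda s: s["x"] + 1))

print(run(block, {"x": 1, "y": 0}))
```

The effect on y survives the block, while the outer x is untouched.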

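To verify programs containing blocks, we have proposed the rule

$\dfrac{\forall v \cdot \{p[x \mapsto v] \wedge x = e[x \mapsto v]\}\ S\ \{q[x \mapsto v]\}}{\{p\}\ \mathbf{new}\ x := e\ \mathbf{in}\ S\ \{q\}}$

Taking into account a shallow embedding of assertions, this corresponds formally
to

$\dfrac{\forall v \cdot \vdash_{\mathrm{Hoare}} \{\lambda(Z, \sigma) \cdot p(Z, \sigma[x \mapsto v]) \wedge \sigma(x) = \mathrm{eval}(\sigma[x \mapsto v])(e)\}\ S\ \{q[x \mapsto v]\}}{\vdash_{\mathrm{Hoare}} \{p\}\ \mathbf{new}\ x := e\ \mathbf{in}\ S\ \{q\}}$    (9)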
It is an improvement over Apt’s version [5] in that it deals with initialised blocks.
Furthermore, no side-conditions are required⁸. In the soundness and completeness
proof, we need to appeal to the following two extensional properties of
substitutions:

$\sigma[x \mapsto \sigma(x)] = \sigma$    (10)

$\sigma[x \mapsto t_1][x \mapsto t_2] = \sigma[x \mapsto t_2]$    (11)
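The gap between pointwise agreement and equality of the functions themselves has a rough analogue in any language with first-class functions. The sketch below is only an analogy under our own assumptions: states are Python closures, and Python's identity-based equality on functions plays the role of an intensional equality.

```python
def update(sigma, x, v):
    """sigma[x |-> v] for states represented as functions on names."""
    return lambda y: v if y == x else sigma(y)

base = {"x": 1, "y": 2}
sigma = lambda y: base.get(y, 0)

lhs10 = update(sigma, "x", sigma("x"))          # sigma[x |-> sigma(x)]
lhs11 = update(update(sigma, "x", 5), "x", 9)   # sigma[x |-> 5][x |-> 9]
rhs11 = update(sigma, "x", 9)                   # sigma[x |-> 9]

names = ["x", "y", "z"]
pointwise10 = all(lhs10(n) == sigma(n) for n in names)
pointwise11 = all(lhs11(n) == rhs11(n) for n in names)

# Pointwise the states agree, yet the function objects are distinct.
print(pointwise10, pointwise11, lhs10 == sigma)
```

Equating such functions outright therefore needs an extra principle, which in type theory is an extensionality axiom.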

We restrict our attention to the crucial step of the completeness proof: 

Lemma 2 (MGF for Blocks). Whenever one can derive $\vdash_{\mathrm{Hoare}} \mathrm{MGF}_{\mathrm{Hoare}}(S)$,
one may also establish $\vdash_{\mathrm{Hoare}} \mathrm{MGF}_{\mathrm{Hoare}}(\mathbf{new}\ x := e\ \mathbf{in}\ S)$.

Proof: Given an arbitrary $v : \mathrm{sort}(x)$, we apply the (stronger) rule of
consequence (8) to the hypothesis $\vdash_{\mathrm{Hoare}} \mathrm{MGF}_{\mathrm{Hoare}}(S)$ in order to derive

$\vdash_{\mathrm{Hoare}} \{\lambda(Z, \sigma) \cdot \sigma[x \mapsto v] \overset{\mathbf{new}\ x := e\ \mathbf{in}\ S}{\blacktriangleright} Z \wedge \sigma(x) = \mathrm{eval}(\sigma[x \mapsto v])(e)\}\ S\ \{\lambda(Z, \tau) \cdot Z = \tau[x \mapsto v]\}$    (12)

From (12), the rule for blocks (9) renders the proof obligation. As a side-condition,
given states Z and $\sigma$ such that

$\sigma[x \mapsto v] \overset{\mathbf{new}\ x := e\ \mathbf{in}\ S}{\blacktriangleright} Z$    (13)

$\sigma(x) = \mathrm{eval}(\sigma[x \mapsto v])(e)$    (14)

we have to find a state $\tau$ such that $\sigma \overset{S}{\blacktriangleright} \tau$ and $Z = \tau[x \mapsto v]$. Inverting
the derivation of (13), there must be such a state $\tau$ which satisfies

$\sigma[x \mapsto v][x \mapsto \mathrm{eval}(\sigma[x \mapsto v])(e)] \overset{S}{\blacktriangleright} \tau$    (15)

and $Z = \tau[x \mapsto \sigma[x \mapsto v](x)]$. Courtesy of (14), the property (15) can be simplified
to $\sigma[x \mapsto v][x \mapsto \sigma(x)] \overset{S}{\blacktriangleright} \tau$. To complete the proof, one needs to
appeal to the substitution properties (10) and (11) in order to replace the state
$\sigma[x \mapsto v][x \mapsto \sigma(x)]$ by the extensionally equal function $\sigma$. □

It follows from the specification of the update operation on states (1) that we
may derive the extensional counterparts of (10) and (11)

$\sigma[x \mapsto \sigma(x)](y) = \sigma(y)$
$\sigma[x \mapsto t_1][x \mapsto t_2](y) = \sigma[x \mapsto t_2](y)$

whereas (10) and (11) themselves do not hold for the standard equality concepts
such as Leibniz or Martin-Löf equality, because they distinguish between intensionally
distinct functions. We therefore need to axiomatise extensionality [30].

⁸ Scoping of the implicitly universally quantified p, S and q ensures that $v \notin
\mathrm{free}(p, S, q)$.
8 Conclusions

To prove completeness, one needs to be able to construct assertions which express 
semantic properties of the programming language. On paper, one usually simply 
assumes that the assertion language is sufficiently expressive. Both soundness
and completeness proofs can be simplified if one does not worry about the actual 
syntactic representation of assertions. 

Moreover, a thorough treatment of syntax has the unpleasant side-effect that
a substantial amount of formal detail is required to deal with substitutions at
the level of states, expressions and assertions. This seems redundant as far as 
metatheory is concerned. Specifically, for simple imperative programs, the proofs 
of soundness and completeness can be conducted irrespective of the chosen sub- 
stitution function. Semantically, the assignment axiom in Hoare Logic simply 
lifts substitutions pointwise from the level of states to predicates on states. 

Syntax does however matter if, instead of metatheory, one wishes to use the 
axioms and rules to verify concrete programs or generate verification conditions. 
With a shallow embedding, assertions are functions mapping states to propositions.
Not only are they more difficult to comprehend than their syntactic
counterparts; without syntactic structure, the proof tool also has little guidance on
how to best reduce substitutions in assertions. Verifying the Quicksort algorithm
based on a shallow embedding, we found that the resulting proof obligations 
arising from the side-condition of the rule of consequence became too large for 
the LEGO system to efficiently process. Having to deal with dependent types, 
type-checking involves expensive calculations. 

One ought to clarify the objectives of employing a theorem prover. There are 
two orthogonal problems in verifying imperative programs. 

1. Establishing soundness and completeness for verification calculi is a challenging
task. Incorrect results based on doing proofs by hand have been published
in the past. The metatheory relates semantics and derivability. Syntax of as- 
sertions is not an issue. In fact, the whole idea of relative completeness is to 
factor out the issue of semantics versus derivability of assertions. 

2. Verifying concrete programs is a labour-intensive task for which computer- 
aided support is vital. 

We feel a reasonable approach would be to employ a shallow embedding for 
metatheory and a deep embedding for concrete examples. The calculus for ver- 
ifying concrete programs can informally build on the axioms and rules investi- 
gated in the meta-theoretical analysis. Relating the two formalisations centres 
mostly on the issue of how expressive the assertion language is. We are somewhat 
sceptical whether this deserves a machine-checked proof. 

But perhaps, there is an alternative. Today’s proof tools are equipped with a 
powerful native logic e.g., LEGO supports intuitionistic higher-order logic with 
a rich universe of data types [31]. However, this cannot be directly employed for 
a deep embedding because its syntax is not inductively defined at the level of the 
proof system. But one could consider treating syntax at a more informal level.
Specifically, based on a shallow embedding, one could employ parsing and pretty-printing
of the theorem prover to convert between the internal representation
and the user interface. Moreover, one could tailor the prover’s tactics engine to 
better deal with substitutions. At the code level of the theorem prover, it is easier 
to implement a suitable substitution function for a particular class of terms. 
Abstract. In the ECOOP’97 conference, the author of the present paper
investigated a conservative extension, called $Ob^{+}_{1<:}$, of the first-order
Object Calculus $Ob_{1<:}$ of Abadi and Cardelli, supporting method extension
in presence of object subsumption. In this paper, we extend that
work with explicit variance annotations and selftypes. The resulting calculus,
called $Ob^{+}_{s<:}$, is a proper extension of $Ob^{+}_{1<:}$. Moreover it is proved
to be type sound.

Categories: Type systems, design and semantics of object-oriented lan- 
guages. 



1 Introduction 

In the last few years, the problem of designing safe and expressive
type-systems for object-based languages (also called prototype-based
languages) has been widely addressed. The seminal works of
[US87,CU89,Mic90,Aba94,FHM94,AC96a] share the same object-oriented
philosophy, where the main entity is the object rather than the
class. In those papers, classes can be easily codified by appropriate objects, following
the “classes-as-objects” analogy of Smalltalk-80 [GR83]. In object-based
languages, objects are modified directly from other objects (the latter called 
prototypes) by adding new methods, or by rewriting old method bodies with 
new ones. A primitive operation of method call is given, to send a message to 
(i.e. invoke a method on) an object. In functional calculi, adding or rewriting a 
method produces a new object that inherits all the properties of the original 
one. 

Another key issue in object-based languages is that of subsumption, i.e.
the capability to use an object with a longer (or more refined) interface in every
context expecting objects with a smaller (or less refined) interface. This feature
has been shown to be fundamental in the object-oriented paradigm, since it allows
a significant reuse of code. Unfortunately, as clearly stated in [FM94,AC96a],
adding object subsumption in presence of object extension very often makes the
type system unsound.
T. Altenkirch et al. (Eds.): TYPES’98, LNCS 1657, pp. 149-165, 1999.

© Springer-Verlag Berlin Heidelberg 1999



150 Luigi Liquori 



As a simple example of this problem, suppose we have a diagonal
point dpoint composed of two fields, x (holds 1) and y (holds self.x). The
type of this object is [x:nat,y:nat]. If we “hide”, by subsumption, the x field,
and we add x again with a new value -1 of type int, and we call y on the
object dpoint, then we lose the subject reduction property, since the evaluation
of dpoint.y, of type nat, yields the value -1 of type int. Other works
[FM95,BL95,Rém95,BBDL97,Rém98,RS98] have addressed the issue of integrating
object subsumption in presence of object extension.
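The scenario can be replayed in an untyped setting. The sketch below is a hypothetical encoding, not the calculus of the paper: objects are dictionaries from method names to functions of self, and `send` and `extend` are illustrative helpers. Python performs no static typing, so the comments only indicate what a type system would have believed.

```python
def send(obj, m):
    """Invoke method m on obj, passing the object itself as self."""
    return obj[m](obj)

def extend(obj, m, body):
    """Return a new object with method m added or overridden."""
    new = dict(obj)
    new[m] = body
    return new

# dpoint : [x:nat, y:nat], with x = 1 and y = self.x
dpoint = {"x": lambda self: 1,
          "y": lambda self: send(self, "x")}

# Subsumption would let us view dpoint at type [y:nat], forgetting x;
# at run time the object is unchanged.
hidden = dpoint

# Re-adding x, now holding an int, looks harmless at type [y:nat] ...
bad = extend(hidden, "x", lambda self: -1)

# ... but y, still believed to return a nat, now yields -1.
print(send(dpoint, "y"), send(bad, "y"))
```

The second result is the value that breaks subject reduction in the typed calculus.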

This paper starts from Abadi & Cardelli’s (first-order) Object Calculus,
called $Ob_{1<:}$ [AC96b]. We briefly recall its features.

— it supports “fixed size” objects (no object extension is provided); 

— it supports method override; 

— it supports object subsumption; 

— its type system catches run-time errors such as message-not-understood. 

In [Liq97b], the $Ob_{1<:}$ calculus was extended by allowing object extension compatible
with object subsumption, by providing a sound static type system and
a typed equational theory on objects. This (conservative) extension was called
$Ob^{+}_{1<:}$. This paper completes the work of [Liq97b] by extending the type system
of $Ob^{+}_{1<:}$ with selftypes and explicit variance annotations.

Selftypes have been shown to be fruitful in the development of flexible type-systems
for object-oriented programming languages (e.g. Eiffel [Mey92], PolyTOIL
[BSvG95]). Selftypes allow one to give a type to methods that return self
or an update of self (for instance, a move method of a point object will have
type int→selftype, where selftype refers to the type of self). Adding selftypes
to object-calculi is not only an exercise of style: in fact we can give a type
to a considerable number of programs that are not typable within the first-order
fragment of $Ob^{+}_{s<:}$.

Explicit variance annotations, instead, support flexible subtyping, and a di- 
rect protection tool from unwanted “read” or “write” operations. More precisely, 
an explicit variance annotation is a “label” attached to a method name and de- 
fined together with the method body; it could be one of the following: private, 
public, read_only, and write_only. The meaning of explicit variance anno- 
tations is straightforward: they denote the access privileges of fields/methods 
belonging to the object. Having explicit variance annotations inside the calculus 
allows a more disciplined use of methods and fields, and enforces object encap- 
sulation. 

The addition of selftypes fits well into the type system of [Liq97b], where we
distinguish between two “kinds” of object-types, namely the saturated object-types
and the diamond object-types. In short, if an object can be typed by a
saturated object-type, then it can receive messages and override the methods
that it contains. Instead, if an object can be typed by a diamond object-type,
then it can receive messages, override some methods, and it can be extended by
new methods. On both types, a subtyping relation is defined.

The subtyping relation on saturated object-types can be commonly found 
in the literature: at first approximation, an object typed with a “longer” (i.e. 



Bounded Polymorphism for Extensible Objects 151 



with more methods) object-type can be used in any context expecting an object
typed with a “shorter” (i.e. with fewer methods) object-type. At this level, object
extension is forbidden since we can first “hide”, by subsumption, a method m of
type $\sigma$, and then extend the object with the same method m of type $\tau$, $\sigma$ being
incompatible with $\tau$.

For diamond object-types, instead, the subtype relation behaves as follows: it 
is still possible to hide a method, but its type is recorded in the diamond object- 
type. Since object extension is only allowed on objects typed with diamond 
object-types, the hidden methods can be re-added again only with the same 
type. 

The calculus that we present in this paper is a conservative extension
of the first-order one $Ob_{1<:}$. In summary, our calculus exhibits the following
features:

— extendible objects with appropriate method specialization of inherited meth- 
ods, 

— a (mytype-covariant) subtyping relation compatible with object extension,

— explicit variance annotations; 

— override of explicit variance annotations; 

— static detection of run-time errors, such as message-not-understood. 

This paper is organized as follows: in Section 2 we will present the Extended
Object Calculus à la Curry (i.e. without type decorations). In Section 3 we will
introduce the types, decorate our $Ob^{+}_{s<:}$ calculus with types, and present the
type system. A number of examples which are meant to give an insight into the
power of $Ob^{+}_{s<:}$ will be provided in Section 4. The last section will be devoted to
a comparison with the paper of Abadi and Cardelli [AC95], the paper of Didier
Rémy [Rém98], and the paper of Riecke and Stone [RS98]. Part of this material
appeared in two technical reports, [Liq97a] and [Liq99].

Acknowledgement. The author is grateful to the anonymous referees for their
helpful comments on this work.

2 The Extended Primitive Object Calculus 

The untyped syntax of the Extended Object Calculus is defined by the following
grammar:

$o ::= s \mid [m_i T_i = \varsigma(s_i) o_i]^{i \in I} \mid o.m \mid o.m := \varsigma(s) o \mid o.m := T \mid o.m := T\,\varsigma(s) o$

T ::= private | public | read_only | write_only.

Here the := operator can be understood as an operator on objects which overrides
method m in case this method is already present in the object, otherwise it
extends the object with m. The grammar for T denotes explicit variance annotations
that are introduced to support a clear form of encapsulation and protection
from unwanted “read” or “write” operations. The expression o.m := T modifies
(i.e. overrides) the explicit variance annotation for m. The explicit variance annotations
have the following intuitive meaning:

— public: methods that have both read/write privilege; 

— read_only: methods that only have read privilege; 

— write_only: methods that only have write privilege; 

— private: methods that do not have read/write privilege, i.e. “encapsulated”. 




Table 1. Small-step Untyped Operational Semantics 

2.1 Small-Step Operational Semantics 

Let o{s} denote an object where the variable s can freely occur, let o{o'} denote
the substitution of the object o' for every free occurrence of s in o when o{s}
is present in the same context, and let, for $i, j \in I$ with $i \neq j$, $m_i$ and $m_j$
be distinct methods. The small-step operational semantics can be given as the
reflexive, transitive and contextual closure of the reduction relation defined in
Table 1. Note that the original semantics of [AC96a] was built from the reduction
rules (1) and (2). As usual, we do not make error conditions explicit. Let 
be the general many-step reduction. We remark that the (Ann) rule overrides 
the explicit variance annotation, leaving the method body unchanged; orthog- 
onally, the (Over) rule modifies the method body, leaving the explicit method 
annotation unchanged. The conditions (a), (b), (c) are the following:

(a) $\triangleq\ T_j \in \{\mathtt{public}, \mathtt{read\_only}\}$

(b) $\triangleq\ T_j \in \{\mathtt{public}, \mathtt{write\_only}\}$

(c) $\triangleq\ T_j : v_j$, $T : v$, and $v_j <: v$.

The condition (a) allows message selection only for fields/methods that are public
or readable from the outside (i.e. annotated with public or read_only).
The condition (b) allows overriding only for fields/methods that are public or
writable from the outside (i.e. annotated with public or write_only). The condition
(c) can be explained as follows. A variance annotation (or variance type v)
can be assigned to an explicit variance annotation (T) via a simple “type” system
proving judgments of the shape T : v, where $v \in \{+, -, \circ, *\}$. The type rules
are:

public : $\circ$    private : $*$    read_only : $+$    write_only : $-$.
$\sigma, \tau$ ::= $t, u$    type-variables
    | $\omega$    the biggest type
    | $obj\ t.[m_i v_i : \sigma_i\{t\}]^{i \in I}$    saturated object-type, $m_i$ distinct
    | $obj\ t.[m_i v_i : \sigma_i\{t\} \diamond m_j v_j : \sigma_j\{t\}]^{i \in I, j \in J}$    diamond object-type, $v_j \in \{\ \}$, $I \cap J = \emptyset$

Table 2. Syntax of Types

Given that, the (c) condition ensures that the new explicit variance annotation T
will override the original one $T_j$ only if their variance types are compatible.
Compatibility is ensured by a partial order relation (<:) on variance types,
given by the following “chains”:

$\circ <: + <: *$, and $\circ <: - <: *$.

As a remark, we observe that we could, in principle, build a simpler and more 
liberal small-step semantics by dropping the side conditions (a), (6), and (c). 
The type system always guarantees the soundness of well-typed expressions. 
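
The typing of explicit annotations and condition (c) can be sketched as a small
check. This is only an illustration under stated assumptions: the names
`VARIANCE_OF`, `variance_leq` and `may_reannotate` are hypothetical and do not
come from the calculus, and the ASCII letter "o" stands for the public-invariant
symbol.

```python
# Variance types: "+" covariant, "-" contravariant,
# "o" public-invariant, "*" private-invariant.
# Typing of explicit annotations, following the rules T : v given in the text.
VARIANCE_OF = {
    "public": "o",
    "private": "*",
    "read_only": "+",
    "write_only": "-",
}

# Generating pairs of the two chains: o <: + <: * and o <: - <: *.
_COVERS = {("o", "+"), ("o", "-"), ("+", "*"), ("-", "*")}


def variance_leq(v, w):
    """Return True when v <: w in the reflexive-transitive closure of the chains."""
    if v == w:
        return True
    return any(a == v and variance_leq(b, w) for (a, b) in _COVERS)


def may_reannotate(old, new):
    """Condition (c): 'new' may replace 'old' only if v_old <: v_new."""
    return variance_leq(VARIANCE_OF[old], VARIANCE_OF[new])
```

For instance, `may_reannotate("public", "read_only")` holds, while
`may_reannotate("private", "public")` does not, since a sealed component cannot
be re-opened.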

For the small-step operational semantics, we can derive an untyped equational
theory (whose judgment is $\vdash o = o'$) from the reduction rules, by simply
adding rules for symmetry, transitivity and congruence, and reformulating the
reduction rules as equalities. We can also define quite simply a big-step
operational semantics that induces a “lazy” strategy of evaluation, via a
natural proof deduction system à la Plotkin. This semantics maps every closed
expression into a normal form, i.e. an irreducible term (for a presentation of
the big-step semantics and of the equational theory see [Liq99]).

3 The Type System 

In the $Ob^{+}_{s<:}$ type system, the set of legal types is defined by the
grammar of Table 2. The type-constant $\omega$ is the supertype of every type.
We omit how to encode basic data-types, which can be treated as in [AC96a]. The
bound type-variable $t$ can (freely) occur in the $\sigma_i$'s and
$\sigma_j$'s, and it is constrained to be covariant. As explained in many
papers (among others [Cas95,Cas96,BCC+96,AC96a,Liq98]), the covariance of self
type is necessary if we want to have a statically typed calculus with
subtyping. As such, binary methods (i.e. methods that receive as input an
argument of the same type as self) are lost. When a method $m_j$ ($j \in I$) is
invoked, the result will have a type $\sigma_j\{t\}$ in which every free
occurrence of $t$ is replaced with the type $\tau$ of the receiver of the
message, i.e. $\sigma_j\{\tau\}$, therefore showing the “recursive” nature of
that type.

Explicit Variance Annotations. As we have sketched in the previous section,
each $v_i$, $v_j$ inside object-types is a variance annotation, i.e. one of the
symbols $+$, $-$, $\circ$, or $*$, standing, respectively, for covariance,
contravariance, public-invariance, and private-invariance. Any omitted $v$'s
are taken to be equal to $\circ$. Covariant methods allow covariant subtyping,
but prevent update (see [FM94,AC96a]). Symmetrically, contravariant methods
allow contravariant subtyping, but prevent invocation. Public-invariant
methods, instead, can be invoked and updated. By subtyping, public-invariant
methods can be regarded as either covariant or contravariant. Private-invariant
methods can be neither invoked nor updated: these methods are typically
introduced (and hence type-checked) as public, readable, or writable, but are
later “sealed” (implicitly via subtyping, or explicitly via annotation
override) as private methods that can be neither accessed nor updated from the
outside. The “compatibility” relation between variance annotations is depicted
below (where $v \to v'$ means $v <: v'$, i.e. a method annotated with $v$ can
also be annotated with $v'$), together with all possible forms of protection
from the outside of the object performed by variance annotations.




Saturated-types. The saturated-types
$\mathrm{obj}\ t.[m_i v_i : \sigma_i\{t\}]^{i \in I}$ are the ordinary
object-types of [AC96a]; in short, objects assigned to saturated-types can
receive messages and can be rewritten.

Diamond-types. The diamond-types
$\mathrm{obj}\ t.[m_i v_i : \sigma_i\{t\} \diamond m_j v_j : \sigma_j\{t\}]^{i \in I}_{j \in J}$
are directly derived from those of [Liq97b]. Diamond-types can be assigned to
objects which can be extended and overridden. The symbol $\diamond$
distinguishes the two parts of that object-type, i.e. the interface-part and
the subsumption-part; the former part describes all methods (with their types)
that may be invoked (if not private or write-only), the latter conveys,
instead, information about (the types of) methods that are subsumed in the
type-checking phase. When a method is subsumed in a diamond-type it simply
moves from the interface-part to the subsumption-part. This “shift” guarantees
that any future addition of that method will be type-consistent with the
previous one. The subsumption-part is also used as an infinite “container” of
unused method types; this is important when we need to add a “fresh” method, in
order not to lose the full flexibility of rapid prototyping. The shifting and
the stocking of methods are performed using a suitable subtype system,
presented in the Appendix.

Variance annotations are elegantly integrated within object-types. Since a
method can also “migrate” from the subsumption-part to the interface-part by
object extension, and since subsumed methods cannot be invoked, it follows that
the occurrence of $m\,v : \sigma$ in the subsumption-part of a diamond-type is
allowed only if $v \in \{\circ, -\}$, i.e. for public or write-only methods (an
object extension of a previously subsumed method behaves, operationally, as an
object override).
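
The movement of methods between the two parts of a diamond-type can be sketched
as follows. This is a minimal model under stated assumptions, not the subtype
system of the Appendix: the class `DiamondType` and the functions `shift`,
`stock` and `extend` are hypothetical names, types are represented by plain
strings, and "o" stands for the public-invariant symbol.

```python
from dataclasses import dataclass, field

# Variances allowed in the subsumption-part: public-invariant or write-only.
WRITABLE = {"o", "-"}


@dataclass
class DiamondType:
    """Hypothetical model: each part maps a method name to (variance, type)."""
    interface: dict = field(default_factory=dict)
    subsumption: dict = field(default_factory=dict)


def shift(ty, name):
    """Subsume a method: move it from the interface-part to the subsumption-part."""
    variance, result = ty.interface[name]
    if variance not in WRITABLE:
        raise TypeError(f"{name} cannot be placed in the subsumption-part")
    interface = {m: d for m, d in ty.interface.items() if m != name}
    return DiamondType(interface, {**ty.subsumption, name: (variance, result)})


def stock(ty, name, variance, result):
    """Store the type of a fresh method in the subsumption-part."""
    if name in ty.interface or name in ty.subsumption:
        raise TypeError(f"{name} is not fresh")
    if variance not in WRITABLE:
        raise TypeError(f"{name} cannot be placed in the subsumption-part")
    return DiamondType(dict(ty.interface),
                       {**ty.subsumption, name: (variance, result)})


def extend(ty, name):
    """Object extension: move a method from the subsumption-part to the interface."""
    if name not in ty.subsumption:
        raise TypeError(f"{name} is not available for extension")
    subsumption = {m: d for m, d in ty.subsumption.items() if m != name}
    return DiamondType({**ty.interface, name: ty.subsumption[name]}, subsumption)
```

Stocking a `col` method, extending with it, and then shifting it back yields the
same type as stocking alone, which mirrors the remark that re-adding a subsumed
method is type-consistent with the previous one.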

3.1 Types and Judgments 

The judgments we set about to prove have the forms:

    $\Gamma \vdash ok$,  $\Gamma \vdash \sigma$,  $\Gamma \vdash o : \sigma$,  $\Gamma \vdash \sigma <: \tau$,  $\Gamma \vdash v\sigma <: v'\tau$,

where $\Gamma$ is a context which gives meaning to the free variables of $o$,
$\sigma$, and $\tau$, generated by the grammar:
$\Gamma ::= \varepsilon \mid \Gamma, s : \sigma \mid \Gamma, u <: \sigma$. In
contexts, we often write $s : u <: \sigma$ to denote $u <: \sigma, s : u$. By
deriving the first two judgments we check the well-formation of the context
$\Gamma$ and of the type $\sigma$, respectively; while with the third one, we
assign a type $\sigma$ to the expression $o$. The last two judgments are the
usual subtyping judgments between types (with variance annotations) of
[AC96a]. As shown in Section 2, in order to override an explicit method
annotation, we need the auxiliary judgment $T : v$, which assigns a variance
type $v$ to an explicit variance annotation $T$.

Cova/Contravariance. Formally, $\sigma\{t^+\}$ stands for a type where the
type-variable $t$ occurs only covariantly. Intuitively, $\sigma\{u^+\}$ means
that $u$ occurs at most positively in $\sigma$; similarly, $\sigma\{u^-\}$
means that $u$ occurs at most negatively in $\sigma$. The formal definitions
are given in Table 3.



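Covariance

    $t\{u^+\}$        always
    $\omega\{u^+\}$   always
    $\mathrm{obj}\ t.[m_i v_i : \sigma_i\{t\}]^{i \in I}\{u^+\}$
                      if $t = u$ or for all $i \in I$:
                          if $v_i = +$, then $\sigma_i\{u^+\}$
                          if $v_i = -$, then $\sigma_i\{u^-\}$
                          if $v_i = \circ$, then $u \notin FV(\sigma_i)$
                          if $v_i = *$, always

Contravariance

    $t\{u^-\}$        if $t \neq u$
    $\omega\{u^-\}$   always
    $\mathrm{obj}\ t.[m_i v_i : \sigma_i\{t\}]^{i \in I}\{u^-\}$
                      if $t = u$ or for all $i \in I$:
                          if $v_i = +$, then $\sigma_i\{u^-\}$
                          if $v_i = -$, then $\sigma_i\{u^+\}$
                          if $v_i = \circ$, then $u \notin FV(\sigma_i)$
                          if $v_i = *$, always

Private/Public Invariance

    $\sigma\{u^*\}$       if $\sigma\{u^+\}$ or $\sigma\{u^-\}$
    $\sigma\{u^\circ\}$   if neither $\sigma\{u^+\}$ nor $\sigma\{u^-\}$ nor $\sigma\{u^*\}$

Variance & $\diamond$-types

    $\mathrm{obj}\ t.[m_i v_i : \sigma_i\{t\} \diamond m_j v_j : \sigma_j\{t\}]^{i \in I}_{j \in J}\{u^v\}$
                      if $\mathrm{obj}\ t.[m_i v_i : \sigma_i\{t\}]^{i \in I}\{u^v\}$ and
                         $\mathrm{obj}\ t.[m_j v_j : \sigma_j\{t\}]^{j \in J}\{u^v\}$

Table 3. Variance Occurrences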
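
The covariance and contravariance clauses of Table 3 can be read as a recursive
check over the syntax of types. The sketch below covers saturated object-types
only and is an illustration under stated assumptions: the classes `TVar`, `Top`
and `Obj` and the functions `free_vars` and `occurs` are hypothetical names, and
"o" stands for the public-invariant symbol.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class TVar:
    name: str


@dataclass(frozen=True)
class Top:
    """The biggest type (omega)."""


@dataclass(frozen=True)
class Obj:
    binder: str                                   # the bound type-variable t
    methods: Tuple[Tuple[str, str, object], ...]  # (label, variance, type)


def free_vars(ty):
    """Free type-variables of a type."""
    if isinstance(ty, TVar):
        return {ty.name}
    if isinstance(ty, Top):
        return set()
    inner = set()
    for _, _, body in ty.methods:
        inner |= free_vars(body)
    return inner - {ty.binder}


def occurs(ty, u, sign):
    """sign '+': u occurs at most positively; sign '-': at most negatively."""
    if isinstance(ty, TVar):
        return sign == "+" or ty.name != u
    if isinstance(ty, Top):
        return True
    if ty.binder == u:            # u is shadowed by the binder
        return True
    flipped = "-" if sign == "+" else "+"
    for _, variance, body in ty.methods:
        if variance == "+":
            ok = occurs(body, u, sign)
        elif variance == "-":
            ok = occurs(body, u, flipped)
        elif variance == "o":
            ok = u not in free_vars(body)
        else:                     # "*": private-invariant, always allowed
            ok = True
        if not ok:
            return False
    return True
```

A type whose only component is a read-only method returning `u` passes the
covariance check for `u` but fails the contravariance check; making the
component public-invariant makes both checks fail.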

The type rules for well-formed contexts and types are routine, and can be
found in the Appendix. We only remark that in the (T-$\diamond$) rule, we
require that, for all $j \in J$, the type annotations $v_j$ must belong to
$\{\circ, -\}$, so allowing a method to be “writable”.

3.2 Subtyping 

The more important subtyping rules are presented in Table 4; the full set can
be found in the Appendix. The subtyping rules that deal with diamond-types and
variance types are the same as in [Liq97b] and [AC95], respectively (see the
Appendix). Moreover we need some extra rules, for instance the rules
(S-Var$\diamond$)
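(S-Var$\diamond$)
$$\frac{\Gamma,\ u <: \mathrm{obj}\ t.[m_i v_i : \sigma_i\{t\} \diamond m_j v_j : \sigma_j\{t\}]^{i \in I}_{j \in J} \vdash v_k\sigma_k\{u\} <: v'_k\sigma'_k\{u\} \quad \forall k \in I \cup J}{\Gamma \vdash \mathrm{obj}\ t.[m_i v_i : \sigma_i\{t\} \diamond m_j v_j : \sigma_j\{t\}]^{i \in I}_{j \in J} <: \mathrm{obj}\ t.[m_i v'_i : \sigma'_i\{t\} \diamond m_j v'_j : \sigma'_j\{t\}]^{i \in I}_{j \in J}}$$

(S-Var)
$$\frac{\Gamma,\ u <: \mathrm{obj}\ t.[m_i v_i : \sigma_i\{t\}]^{i \in I} \vdash v_k\sigma_k\{u\} <: v'_k\sigma'_k\{u\} \quad \forall k \in I}{\Gamma \vdash \mathrm{obj}\ t.[m_i v_i : \sigma_i\{t\}]^{i \in I} <: \mathrm{obj}\ t.[m_i v'_i : \sigma'_i\{t\}]^{i \in I}}$$

(S-Inv1)
$$\frac{\Gamma \vdash \sigma \quad v \in \{\circ, *\}}{\Gamma \vdash v\sigma <: v\sigma}$$

(S-Inv2)
$$\frac{\Gamma \vdash \sigma \quad v \in \{+, -\}}{\Gamma \vdash v\sigma <: *\sigma}$$

Table 4. Some Subtyping Rules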

and (S-Var) to deal with variance types for object-types of the same length,
and the rule (S-Inv2) to say that a read-only or write-only component can be
regarded as a private one. The rule (S-Inv1) is simply a reformulation of
reflexivity. As a side remark, observe that the condition
$\forall k \in I \cup J$ in rule (S-Var$\diamond$) allows this rule to be
applied also in the subsumption-part of the diamond-type. This condition is
more liberal than the simpler $\forall k \in I$, since it allows one to re-add
a forgotten method with a type different from the one we have forgotten (in
accordance with its variance type), without losing type soundness.



3.3 Type Rules 

We decorate our Extended Object Calculus with types as follows:

    $o ::= s \mid [m_i T_i = \varsigma(s_i{:}u <: \tau_i)o_i]^{i \in I} \mid o.m \mid o.m := \varsigma(s{:}u <: \tau)o \mid$
    $\qquad o.m := T \mid o.m := T\varsigma(s{:}u <: \tau)o.$

The $\varsigma$-binder scopes over the object-variable $s$, referring to self,
and the type-variable $u$, referring to the type of self (i.e. self type). The
method bodies could be read, in the $F_{<:}$ jargon, as the polymorphic lambda
abstraction $\Lambda u <: \tau_i.\lambda s{:}u.o_i$. We analyze in detail the
most important type rules of $Ob^{+}_{s<:}$ (presented in Table 5); see the
Appendix for the full set of rules.

[(V-Sel)] This rule gives a type for a message send; in order for a message
send to be type correct, the host object $o$ must contain the method name
$m_k$ in its type. Moreover, the substitution of $t$ with $\tau$ reflects the
recursive nature of object-types. The host object $o$ can also be an
object-variable $s$: in this case the type $\tau$ will be a type-variable $u$.
Method selection is permitted only on public-invariant or covariant
components.

[(V-Over)] This rule overrides the method $m_k$, provided that $m_k$ belongs
to the interface of the object $o$ (i.e. $k \in I$), and that the new body for
$m_k$ uses the methods already present in $o$; this last condition is ensured
by the second subtyping judgment of the premises, and amounts to saying that
those methods are present in the interface-part of the type $\tau$. Object
override is allowed only on public-invariant or contravariant components. We
also observe that $\tau$ can be a type-variable, and, as such, method override
is allowed inside method bodies.

[(V-Ann1)] This rule overrides the explicit variance annotation for method
$m_k$ (already present in the type of $o$), only if the new annotation $T$ has
a variance type compatible with the variance type of $m_k$ present in the
object-type assigned to $o$. In this rule, the type of the object $o$ is a
saturated-type, but it can be a diamond-type as well, as in rule (V-Ann2). The
second premise guarantees the presence of method $m_k$ and the compatibility
of its variance type with the new one.

[(V-Ext)] This rule extends an object $o$ with a method $m_k$. Firstly, one
can see that we cannot extend an object whose object-type is saturated.
Secondly, this rule extends an object with a new (fresh) method if and only if
that method is present in the subsumption-part of the diamond-type assigned to
the object to be extended. But this condition can always be satisfied by a
diamond-type thanks to the subtyping rule (S-Ext$\diamond$). Of course we have
$T : v_k$. The condition $H \subseteq I$ guarantees that the methods which are
essential to type the body $o'$ are already present in the interface-part of
the type
$\mathrm{obj}\ t.[m_i v_i : \sigma_i\{t\} \diamond m_j v_j : \sigma_j\{t\}]^{i \in I}_{j \in J}$.

Note that this rule can also be applied when the method belongs to $o$ but has
already been subsumed via an application of the subtyping rule
(S-Shift$\diamond$). In this case, operationally, the extension is a method
override. Moreover observe that, since object extension modifies the object
from the outside, it follows that we can extend an object only with public or
write-only components. In fact, by looking at the subtyping rules, we can see
that all variance annotations inside the subsumption-part are public-invariant
or contravariant.

As minor remarks on object extension, observe that:

— a “self-extension” operation is forbidden inside method bodies: in other
words, the object $o = [m = \varsigma(s)s.n := \varsigma(s)1]$, where $n$ does
not belong to $o$, cannot be type-decorated, because we are not able to give
any correct type for the method $m$.

— inside method bodies, the $\varsigma$-bound variables $s_i$ (referring to
self) in the same object $o$ have different bound object-types. As an example
consider the object
$[m = \varsigma(s{:}u <: [m{:}int])1,\ n = \varsigma(s'{:}u <: [m{:}int, n{:}int])s'.m]$
of type $[m{:}int, n{:}int]$. This fits well with the semantics of the message
send thanks to the presence of the subtyping rule (S-Width).

— if we override the method $n$ of $o'$ with a new body (e.g.
$\varsigma(s{:}u <: [n{:}int])1$), the new bound for $u$ in $n$ does not need
to be related to the older one; this is sound because the bound depends on the
methods needed to type the new body.

— thanks to our sophisticated subtyping system we are not obliged to know
“a priori” (in advance) all the future extensions of an object; in fact, the
saturated-part of a diamond-type can always be filled with fresh methods
thanks to the rule (S-Ext$\diamond$).

The type system enjoys the subject reduction property.
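(V-Sel)
$$\frac{\Gamma \vdash o : \tau \quad \Gamma \vdash \tau <: \mathrm{obj}\ t.[m_k v_k : \sigma_k\{t\}] \quad v_k \in \{\circ, +\}}{\Gamma \vdash o.m_k : \sigma_k\{\tau\}}$$

(V-Over)
$$\frac{\Gamma \vdash o : \tau \quad \Gamma \vdash \tau <: \mathrm{obj}\ t.[m_i v_i : \sigma_i\{t\}]^{i \in I} \quad k \in I \quad \Gamma,\ s_k : u <: \mathrm{obj}\ t.[m_i v_i : \sigma_i\{t\}]^{i \in I} \vdash o' : \sigma_k\{u\} \quad v_k \in \{\circ, -\}}{\Gamma \vdash o.m_k := \varsigma(s_k{:}u <: \mathrm{obj}\ t.[m_i v_i : \sigma_i\{t\}]^{i \in I})o' : \tau}$$

(V-Ann1)
$$\frac{\Gamma \vdash o : \mathrm{obj}\ t.[m_i v_i : \sigma_i\{t\}]^{i \in I} \quad \Gamma \vdash \mathrm{obj}\ t.[m_i v_i : \sigma_i\{t\}]^{i \in I} <: \mathrm{obj}\ t.[m_k v : \sigma_k\{t\}] \quad T : v}{\Gamma \vdash o.m_k := T : \mathrm{obj}\ t.[m_i v_i : \sigma_i\{t\},\ m_k v : \sigma_k\{t\}]^{i \in I \setminus \{k\}}}$$

(V-Ext)  (Let $\tau_k = \mathrm{obj}\ t.[m_h v_h : \sigma_h\{t\}]^{h \in H}$.)
$$\frac{\Gamma \vdash o : \mathrm{obj}\ t.[m_i v_i : \sigma_i\{t\} \diamond m_j v_j : \sigma_j\{t\}]^{i \in I}_{j \in J} \quad k \in J \quad \Gamma,\ s_k : u <: \tau_k \vdash o' : \sigma_k\{u\} \quad T : v_k \quad H \subseteq I}{\Gamma \vdash o.m_k := T\varsigma(s_k{:}u <: \tau_k)o' : \mathrm{obj}\ t.[m_i v_i : \sigma_i\{t\} \diamond m_j v_j : \sigma_j\{t\}]^{i \in I \cup \{k\}}_{j \in J \setminus \{k\}}}$$

Table 5. Some Term Typing Judgments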

Theorem 1 (Subject Reduction for $Ob^{+}_{s<:}$).
If $\Gamma \vdash o : \sigma$ and $o \twoheadrightarrow o'$, then $\Gamma \vdash o' : \sigma$.



4 Applications 

In this section, we present a number of examples that help to illustrate the
features of $Ob^{+}_{s<:}$. Any unspecified $T$ and $v$ are taken to be equal
to public and $\circ$ respectively.

Method Specialization. The following extendible point

    $point = [x = \varsigma(s{:}u <: \sigma_1)1,\ plus1 = \varsigma(s{:}u <: \sigma_2)s.x := \varsigma(s'{:}u' <: \sigma_1)s.x + 1]$,

is typable with $\mathrm{obj}\ t.[x{:}int,\ plus1{:}t\ \diamond]$, where
$\sigma_1 = [x{:}int]$ and $\sigma_2 = \mathrm{obj}\ t.[x{:}int,\ plus1{:}t]$.

Subtyping. Let $point$ be as before, and let $c\_point$ be obtained by
extending $point$ with a $col$ field. By an inspection of the typing rules for
$Ob^{+}_{s<:}$ we derive $\vdash point : P_\diamond$ and
$\vdash c\_point : CP_\diamond$, where

    $P = \mathrm{obj}\ t.[x{:}int,\ plus1{:}t]$              $CP = \mathrm{obj}\ t.[x{:}int,\ col{:}colors,\ plus1{:}t]$

    $P_\diamond = \mathrm{obj}\ t.[x{:}int,\ plus1{:}t\ \diamond]$      $CP_\diamond = \mathrm{obj}\ t.[x{:}int,\ col{:}colors,\ plus1{:}t\ \diamond]$.

Now consider the following programs and related (derivable) types, where we
introduce $\lambda$-binders to denote functions:

    $f_1 = \lambda(s{:}P)s.x$

    $f_2 = \lambda(s{:}P)s.x := \varsigma(s'{:}u <: [x{:}int])2$

    $f_3 = \lambda(s{:}P_\diamond)s.col := \varsigma(s'{:}u <: [col{:}colors])red$

Again, by inspecting the typing rules, we find that the following judgments are
derivable:

    $\vdash f_1(point) : int$            $\vdash f_1(c\_point) : int$
    $\vdash f_2(point) : P$              $\vdash f_2(c\_point) : P$
    $\vdash f_3(point) : CP_\diamond$    $\nvdash f_3(c\_point) : CP_\diamond$


The last judgment is correctly false since CP* 

Method Annotations for Encapsulation. Consider an object $p$ with a field $x$
and two methods, namely get and set, invokable from the outside which,
respectively, return and modify the value of $x$. It is natural to give the
following saturated-type to $p$:

    $Point = \mathrm{obj}\ t.[x^{\circ} : int,\ get^{\circ} : int,\ set^{\circ} : int \to t]$.

Then, in order to make the local field $x$ protected against external access,
and the get and set methods not writable, we could override $p$ as follows:

    $prot\_p \triangleq ((p.x := \mathtt{private}).get := \mathtt{read\_only}).set := \mathtt{read\_only}$,

of type

    $ProtPoint = \mathrm{obj}\ t.[x^{*} : int,\ get^{+} : int,\ set^{+} : int \to t]$,

where $Point <: ProtPoint$. So, the $x$ variable becomes protected from the
outside, and the get and set methods can only be invoked but not updated.
As such, we obtain a neat distinction between public messages (i.e. the
interface visible outside the object) and private variables (i.e. variables or
local methods not accessible from the outside).
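
The sealing of `p` into `prot_p` can be mimicked with a small functional model
of objects. This is a sketch under stated assumptions rather than the semantics
of the calculus: objects are dictionaries from method names to
(annotation, body) pairs, the helper names `send`, `override` and `annotate` are
hypothetical, and the `internal` flag, which lets a method body reach self
without the outside-access check, is a simplifying assumption of this example.

```python
READABLE = {"public", "read_only"}    # condition (a)
WRITABLE = {"public", "write_only"}   # condition (b)
VARIANCE_OF = {"public": "o", "private": "*", "read_only": "+", "write_only": "-"}
# The partial order on variance types, listed explicitly.
LEQ = {(v, v) for v in "o+-*"} | {
    ("o", "+"), ("o", "-"), ("o", "*"), ("+", "*"), ("-", "*")}


def send(obj, name, internal=False):
    """Message selection, refused from the outside on non-readable components."""
    annotation, body = obj[name]
    if not internal and annotation not in READABLE:
        raise PermissionError(f"{name} is not readable from the outside")
    return body(obj)


def override(obj, name, body, internal=False):
    """Method override, refused from the outside on non-writable components."""
    annotation, _ = obj[name]
    if not internal and annotation not in WRITABLE:
        raise PermissionError(f"{name} is not writable from the outside")
    return {**obj, name: (annotation, body)}


def annotate(obj, name, new_annotation):
    """Annotation override, allowed only along the variance order (condition (c))."""
    annotation, body = obj[name]
    if (VARIANCE_OF[annotation], VARIANCE_OF[new_annotation]) not in LEQ:
        raise PermissionError(f"cannot re-annotate {name} as {new_annotation}")
    return {**obj, name: (new_annotation, body)}


p = {
    "x": ("public", lambda self: 0),
    "get": ("public", lambda self: send(self, "x", internal=True)),
    "set": ("public", lambda self: lambda n: override(
        self, "x", lambda _self: n, internal=True)),
}

prot_p = annotate(annotate(annotate(p, "x", "private"),
                           "get", "read_only"), "set", "read_only")
```

With these definitions `send(prot_p, "get")` still answers, whereas
`send(prot_p, "x")` and `override(prot_p, "get", ...)` are refused, and
`annotate(prot_p, "x", "public")` fails because a private component cannot be
made public again.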

Classes as Collection of Pre-methods. In [Liq97b] a first-order encoding
of classes-as-objects was given. As $Ob^{+}_{s<:}$ is an extension of
[Liq97b], it clearly follows that it also permits the building of classes and
class instances. However, other encodings of classes are possible, provided
that we extend $Ob^{+}_{s<:}$ with polymorphic types. With polymorphic types we
are able to build classes as a collection of parametric pre-methods (see the
footnote below). A “pre-method” is a polymorphic procedure that can be later
used to construct a method parametric in the type of self. As an example, let
the following object
$mem = [get = \varsigma(s)true,\ set = \varsigma(s)\lambda(b)s.get := \varsigma(s')b]$
of type $Mem = \mathrm{obj}\ t.[get : bool,\ set :$

Footnote: If one wants to play with $Ob^{+}_{s<:}$, one may add polymorphic
types and type abstraction/application, following Section 4 of [AC95].



P^P 

P^^CP^. 
$bool \to t]$, and consider the “class” $memClass$ of [AC95] (for the sake of
simplicity, all type-decorations are omitted, and $\Lambda(\ )$ stands for
polymorphic type-abstraction)

    $memClass = [new = \varsigma(s)[get = \varsigma(s')s.pre\text{-}get(\ )(s'),$
    $\qquad\qquad\qquad\qquad\quad set = \varsigma(s')s.pre\text{-}set(\ )(s')],$
    $\qquad pre\text{-}get = \varsigma(s)\Lambda(\ )\lambda(s')false,$
    $\qquad pre\text{-}set = \varsigma(s)\Lambda(\ )\lambda(s')\lambda(b)s'.get := \varsigma(s'')b]$,

of type $Class(Mem) = [new : Mem,\ pre\text{-}get : \forall(u <: Mem)u \to bool,\ pre\text{-}set : \forall(u <: Mem)u \to bool \to u]$.
The pre-get and pre-set methods of $memClass$ are parametric pre-methods that
do not use the self of $memClass$; they are used inside the bodies of get and
set of the class instances generated by the new method of $memClass$. An
instance $mem$ of $memClass$ is generated by sending the message new to the
class, i.e. $mem = memClass.new : Mem$. More generally, if a class instance
can be typed with $Type = \mathrm{obj}\ t.[m_i v_i : \sigma_i\{t\}\ \diamond]^{i \in I}$,
then the type of the class whose instances can be typed with $Type$ is
$Class(Type) = [new : Type,\ pre\text{-}m_i : \forall(u <: Type)u \to \sigma_i\{u\}]^{i \in I}$.
As an interesting remark, we note that the type of class instances is a
diamond-type: as such, all class instances can be dynamically extended by new
methods (in pure prototype-based style).
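
The encoding of `memClass` can be transcribed into an untyped sketch in which
type abstraction is erased. The representation below is an assumption made for
illustration only: objects are dictionaries from method names to functions of
self, update is functional (it returns a new dictionary), and the helper `send`
is a hypothetical name.

```python
def send(obj, name):
    """Invoke a method by binding self to the receiver."""
    return obj[name](obj)


mem_class = {
    # new builds an instance whose methods delegate to the pre-methods of the class.
    "new": lambda cls: {
        "get": lambda s: send(cls, "pre_get")(s),
        "set": lambda s: send(cls, "pre_set")(s),
    },
    # Pre-methods ignore the self of the class and take the instance explicitly.
    "pre_get": lambda cls: (lambda s: False),
    "pre_set": lambda cls: (lambda s: (lambda b: {**s, "get": lambda _s: b})),
}

mem = send(mem_class, "new")
updated = send(mem, "set")(True)

# Prototype-style extension of an instance with a fresh method.
extended = {**updated, "flip": lambda s: not send(s, "get")}
```

Here `send(mem, "get")` answers `False`, as prescribed by pre-get, while
`send(updated, "get")` answers `True` and `mem` itself is left unchanged.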

Modelling Inheritance. Given an object-type $Type'$ (we consider a
diamond-type, but we can consider a saturated-type as well) of the shape
$\mathrm{obj}\ t.[m_i v_i : \tau_i\{t\}\ \diamond]^{i \in I \cup J}$ and a
class type
$Class(Type') = [new : Type',\ pre\text{-}m_i : \forall(u <: Type')u \to \tau_i\{u\}]^{i \in I \cup J}$,
we can say that, for all $i \in I$, a pre-method $pre\text{-}m_i$ is
inheritable from $Class(Type)$ to $Class(Type')$ if and only if $u <: Type'$
implies $\sigma_i\{u\} <: \tau_i\{u\}$. As in [AC95], the above condition
holds for invariant and contravariant components, but not necessarily for
covariant components. We overcome this restriction on covariant components
using object extension. A detailed treatment of inheritance can be found in
[AC95].

5 Related Work 

This section is devoted to a comparison with some interesting and related
works that have appeared in the literature in the last few years.

[Rem98] A calculus very close to $Ob^{+}_{s<:}$ is the one of Didier Rémy. In
this calculus, objects have the shape
$\zeta(\chi, \tau)[m_i = \varsigma(s_i)o_i]^{i \in I}$, where $\zeta$ is a
binder for types, $\tau$ denotes the type of the whole object, i.e. selftype,
$\chi$ is a type-variable that also denotes selftype (since in the type rules
$s_i : \chi$), and the $m_i$ are the methods contained in the object with
respective bodies $\varsigma(s_i)o_i$.

Let $o = \zeta(\chi, \tau)[m_i = \varsigma(s_i)o_i]^{i \in I}$. Also in the
calculus of Rémy, it is possible to extend objects with new methods; when we
extend an object with a method $m$ (in our notation
$o.m := \varsigma(s{:}u <: \tau)o'$) this reduces to
$\zeta(\chi, \tau \leftarrow \tau')[m_i = \varsigma(s_i)o_i,\ m = \varsigma(s)o']^{i \in I}$,
where $\tau'$ is the type of self in the body of $m$, and
$\tau \leftarrow \tau'$ is the new type of self obtained by suitable type
reduction rules, necessary to keep programs both well-formed and well-typed
(but the operational semantics is still not type-driven). While there are
similarities between our proposal and the one of [Rem98], notably the use of
subtyping for dealing with object extension, the two calculi have some
fundamental differences:

— in [Rem98] after an object update, the type of self must be “recompiled”
using the $\leftarrow$ function, since the type of self is factorised by all
methods; this is not the case in $Ob^{+}_{s<:}$ because of a “redundancy” of
type annotations inside method bodies;

— the [Rem98] calculus has the so-called virtual methods (absent in
$Ob^{+}_{s<:}$);

— the $Ob^{+}_{s<:}$ calculus has override of explicit annotations (absent in
[Rem98]);

— variance annotations are the same in both calculi, but the private-invariant
annotation is absent in [Rem98];

— in $Ob^{+}_{s<:}$, we distinguish between two shapes of objects, namely
extendible objects and “fixed-size” objects, while in [Rem98] all objects are
taken to be extendible;

— in [Rem98], object-types are interpreted as total functions from method
labels to types, while in $Ob^{+}_{s<:}$ we rely on the more conventional
interpretation of object-types as partial functions.

[RS98] The paper of Riecke and Stone describes a functional Object Calculus
à la Abadi and Cardelli that allows unrestricted object extension in the
presence of object subsumption. The novelty of this paper is that we can
forget a method with type $\sigma$ and later re-add it with a type $\tau$
incompatible with $\sigma$. This can be done by distinguishing “external”
method names from “internal” ones. A proper “dictionary” is attached to each
object in order to “link” external labels to internal labels. Private fields
can be hidden from the outside by subsumption.

One of the novelties of this paper is the operational semantics, which at each
step manipulates method dictionaries. This manipulation has a run-time cost
that can slow down the running of the program, although some optimization
techniques are proposed by the authors. Moreover, adding dictionaries has an
impact on the style of programming, since after a series of extension and
subsumption steps one must reconstruct the correct behaviour of some methods.

[AC95] This paper is the “father” of the present paper; many of the ideas present 
in this paper have stimulated our development. The Imperative Object Calculus 
is to our knowledge the first object calculus with an imperative semantics, a 
sound type system with selftypes, subtyping and variance annotations. 
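A The Extended Object Calculus

Well-formed Contexts

(C-$\varepsilon$)
$$\frac{}{\varepsilon \vdash ok}$$

(C-s)
$$\frac{\Gamma \vdash \sigma \quad s \notin dom(\Gamma)}{\Gamma, s : \sigma \vdash ok}$$

(C-t)
$$\frac{\Gamma \vdash \sigma \quad t \notin dom(\Gamma)}{\Gamma, t <: \sigma \vdash ok}$$

Well-formed Types

(T-$\diamond$)
$$\frac{\Gamma, t <: \omega \vdash \sigma_i\{t^+\} \quad \forall i \in I \qquad I \cap J = \emptyset \qquad \Gamma, t <: \omega \vdash \sigma_j\{t^+\} \quad \forall j \in J \qquad v_j \in \{\circ, -\}}{\Gamma \vdash \mathrm{obj}\ t.[m_i v_i : \sigma_i\{t\} \diamond m_j v_j : \sigma_j\{t\}]^{i \in I}_{j \in J}}$$

(T-Sat)
$$\frac{\Gamma, t <: \omega \vdash \sigma_i\{t^+\} \quad \forall i \in I}{\Gamma \vdash \mathrm{obj}\ t.[m_i v_i : \sigma_i\{t\}]^{i \in I}}$$

(T-Var)
$$\frac{\Gamma, t <: \sigma, \Gamma' \vdash ok}{\Gamma, t <: \sigma, \Gamma' \vdash t}$$

(T-$\Omega$)
$$\frac{\Gamma \vdash ok}{\Gamma \vdash \omega}$$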



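Subtyping Judgments with Variance Annotations

(S-Cova)
$$\frac{\Gamma \vdash \sigma <: \sigma' \quad v \in \{\circ, +\}}{\Gamma \vdash v\sigma <: +\sigma'}$$

(S-Contra)
$$\frac{\Gamma \vdash \sigma' <: \sigma \quad v \in \{\circ, -\}}{\Gamma \vdash v\sigma <: -\sigma'}$$

(S-Inv1)
$$\frac{\Gamma \vdash \sigma \quad v \in \{\circ, *\}}{\Gamma \vdash v\sigma <: v\sigma}$$

(S-Inv2)
$$\frac{\Gamma \vdash \sigma \quad v \in \{+, -\}}{\Gamma \vdash v\sigma <: *\sigma}$$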

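Standard Subtyping Judgments

(S-Refl)
$$\frac{\Gamma \vdash \sigma}{\Gamma \vdash \sigma <: \sigma}$$

(S-Trans)
$$\frac{\Gamma \vdash \sigma <: \tau \quad \Gamma \vdash \tau <: \rho}{\Gamma \vdash \sigma <: \rho}$$

(S-$\Omega$)
$$\frac{\Gamma \vdash \sigma}{\Gamma \vdash \sigma <: \omega}$$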



Subtyping Judgments for Object-Types 

{S-Var^) 

T, M <: obj : ai{t} amjVj : aj{t}]j^j h Vkak{u} <:v).al.{u} V k£lUj 

F h obj t.\miVi : ai{t}0mjVj : cTj{t}]j|j <: obj : cr'{t} bmjv' : cr'{t}]j|j 

F,u<\ obj : ai{t}] h w*<Tfe{u}<: \/k£l 

(S-Var) 

F h obj t.[miVi : ai{t}] <: obj 
(S-ShifU) 

F \- ohjt.[miVi ■. ai{t} OTRjVj ■. aj{t}].2j'^^ VfcG{°,“} 'ik£K 



F h obj : ai{t}<>mjVj : aj{t}]^^j^^ <: obj t.[miVi : ai{t}<>mjVj : 

[S—Exto) 

F \- db)t.\miVi \ ai{t} amjVj \ aj{t}Y-fj^jj^ WfcG{°,“} VfcGA 



F h obj : CTi{t} <: obj : CTi{t} 
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r h obj : Gi{t} OTRjVj : 

(S-SaU) 

r h obj : ai{t}omjVj : <: obj : Gi{t}] 

r h objt.[miUi : (7i{t}] 

(S-Width) 

r h obj : Gi{t}] : Gi{t}\ 

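Type Rules for Objects

(V-Proj)
$$\frac{\Gamma, s : \sigma, \Gamma' \vdash ok}{\Gamma, s : \sigma, \Gamma' \vdash s : \sigma}$$

(V-Sub)
$$\frac{\Gamma \vdash o : \sigma \quad \Gamma \vdash \sigma <: \tau}{\Gamma \vdash o : \tau}$$

(V-Sel)
$$\frac{\Gamma \vdash o : \tau \quad \Gamma \vdash \tau <: \mathrm{obj}\ t.[m_k v_k : \sigma_k\{t\}] \quad v_k \in \{\circ, +\}}{\Gamma \vdash o.m_k : \sigma_k\{\tau\}}$$

(V-Obj)  (Let $\tau_i = \mathrm{obj}\ t.[m_h v_h : \sigma_h\{t\}]^{h \in H_i}$.)
$$\frac{\Gamma,\ s_i : u <: \tau_i \vdash o_i : \sigma_i\{u\} \quad H_i \subseteq I \quad T_i : v_i \quad \forall i \in I}{\Gamma \vdash [m_i T_i = \varsigma(s_i{:}u <: \tau_i)o_i]^{i \in I} : \mathrm{obj}\ t.[m_i v_i : \sigma_i\{t\}\ \diamond]^{i \in I}}$$

(V-Over)
$$\frac{\Gamma \vdash o : \tau \quad \Gamma \vdash \tau <: \mathrm{obj}\ t.[m_i v_i : \sigma_i\{t\}]^{i \in I} \quad k \in I \quad \Gamma,\ s_k : u <: \mathrm{obj}\ t.[m_i v_i : \sigma_i\{t\}]^{i \in I} \vdash o' : \sigma_k\{u\} \quad v_k \in \{\circ, -\}}{\Gamma \vdash o.m_k := \varsigma(s_k{:}u <: \mathrm{obj}\ t.[m_i v_i : \sigma_i\{t\}]^{i \in I})o' : \tau}$$

(V-Ann1)
$$\frac{\Gamma \vdash o : \mathrm{obj}\ t.[m_i v_i : \sigma_i\{t\}]^{i \in I} \quad \Gamma \vdash \mathrm{obj}\ t.[m_i v_i : \sigma_i\{t\}]^{i \in I} <: \mathrm{obj}\ t.[m_k v : \sigma_k\{t\}] \quad T : v}{\Gamma \vdash o.m_k := T : \mathrm{obj}\ t.[m_i v_i : \sigma_i\{t\},\ m_k v : \sigma_k\{t\}]^{i \in I \setminus \{k\}}}$$

(V-Ann2)
$$\frac{\Gamma \vdash o : \mathrm{obj}\ t.[m_i v_i : \sigma_i\{t\} \diamond m_j v_j : \sigma_j\{t\}]^{i \in I}_{j \in J} \quad \Gamma \vdash \mathrm{obj}\ t.[m_i v_i : \sigma_i\{t\} \diamond m_j v_j : \sigma_j\{t\}]^{i \in I}_{j \in J} <: \mathrm{obj}\ t.[m_k v : \sigma_k\{t\}] \quad T : v}{\Gamma \vdash o.m_k := T : \mathrm{obj}\ t.[m_i v_i : \sigma_i\{t\},\ m_k v : \sigma_k\{t\} \diamond m_j v_j : \sigma_j\{t\}]^{i \in I \setminus \{k\}}_{j \in J}}$$

(V-Ext)  (Let $\tau_k = \mathrm{obj}\ t.[m_h v_h : \sigma_h\{t\}]^{h \in H}$.)
$$\frac{\Gamma \vdash o : \mathrm{obj}\ t.[m_i v_i : \sigma_i\{t\} \diamond m_j v_j : \sigma_j\{t\}]^{i \in I}_{j \in J} \quad k \in J \quad \Gamma,\ s_k : u <: \tau_k \vdash o' : \sigma_k\{u\} \quad T : v_k \quad H \subseteq I}{\Gamma \vdash o.m_k := T\varsigma(s_k{:}u <: \tau_k)o' : \mathrm{obj}\ t.[m_i v_i : \sigma_i\{t\} \diamond m_j v_j : \sigma_j\{t\}]^{i \in I \cup \{k\}}_{j \in J \setminus \{k\}}}$$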
Abstract. We extend Martin-Löf's constructive set theory with effective
quotient sets and the rule of uniqueness of propositional equality proofs. We
prove that in the presence of at least two universes $U_0$ and $U_1$ the
principle of excluded middle holds for small sets. The key point is the
combination of uniqueness of propositional equality proofs with the
effectiveness condition that allows us to recover information on the
equivalence relation from the equality on the quotient set.



1 Introduction 

Within the framework of Martin-Löf's Intuitionistic Type
Theory [Mar84,NPS90], quotient sets are also desirable [NV97], for instance in
order to generate some formal topologies [Sam87]. But some care is necessary
in extending Martin-Löf's set theory with quotient sets if we want to keep
constructivity.

Here, we consider the extension of intensional Martin-Löf's set
theory (MLTT) with quotient sets as formulated in [Hof95] and we want to
explore the possibility of making quotients effective. Intuitively,
effectiveness for quotient sets means that if two elements of a set are in the
same “equivalence class” as represented by an element of the quotient set,
then the two elements satisfy the equivalence relation. A property with this
name can be found in category theory, referring to an equivalence relation
(see e.g. [MR77]). The usual constructions of quotients in classical set
theory, in categorical universes like toposes and in the setoids made out of
type theory enjoy this property.

In this paper we give an answer to the question of extending MLTT with
effective quotients, if we also add the rule of uniqueness of equality proofs
[Hof95]. Indeed, although the rule of uniqueness of equality proofs is not
provable in the intensional version of Martin-Löf's set theory, as proved by
M. Hofmann and T. Streicher [HS95], it is definable by pattern-matching
[Coq92], which is a very useful tool for implementations of type theory.

To formulate effectiveness we need to pass to the extension of Martin-Löf's
type theory, here called iTT, augmented with the true judgement
A true (see [Mar84,Val95]). According to the paradigm in [Val95], the

* This work has been done at the Department of Mathematics, University of Padova, 
Italy 



T. Altenkirch et al. (Eds.): TYPES’98, LNCS 1657, pp. 164—178, 1999. 
© Springer- Verlag Berlin Heidelberg 1999 



About Effective Quotients in Constructive Type Theory 






rules of iTT about true judgements, called admissible, are exactly those
obtained from MLTT such that we can prove at the level of the metalanguage the
following preservation property: the judgement “A true” is derivable in iTT iff
there is a proof-term “a” such that “$a \in A$” is derivable in MLTT.

We call iTT^ the extension with true judgements corresponding to MLTT'^, 
that is MLTT augmented with intensional quotients and the uniqueness of equal- 
ity proofs. Now, in iTT^ we express the effectiveness condition in terms of true 
judgements as follows 

$$\frac{a \in A \quad b \in A \quad \mathsf{Id}(A/R, [a], [b])\ true}{R(a, b)\ true}$$

and we call iTT^^ the extension of iTT^ with this condition. We add effective- 
ness as a condition on true judgements, because we are not able to think of a 
constructive type theory with only the four kinds of judgements 

    $A\ set$    $A = B$    $a \in A$    $a = b \in A$

that extends MLTT^ and whose extension with true judgements makes the 
effectiveness condition admissible. Indeed, in order to admit the effectiveness 
condition in the corresponding extension with true judgements, this claimed 
type theory should allow to derive the following rule 

$$\frac{a \in A \quad b \in A \quad p \in \mathsf{Id}(A/R, [a], [b])}{?(a, b, p) \in R(a, b)}\ \text{eff}$$

for some proof-term $?(a, b, p)$.

preservative 

MLTT > > iTT 

preservative 

MLTT^ > ^ 

preservative 

MLTT^+ efF(?) > ^ efF(?) 

Actually, we will show here that we can not have such a theory where eff can 
be derived, since even in the extension iTT^^ the principle of excluded middle 
holds for small sets. Indeed, in the presence of quotient sets with the effective- 
ness condition, the rule of uniqueness of propositional equality proofs and at 
least two universes $U_0$ and $U_1$, to which the codes of quotient sets are added,
we can reproduce for small sets the proof of Diaconescu [Dia75] made within 
topos theory that the axiom of choice implies the principle of excluded middle. 
Therefore, to be clearer, if a constructed type theory including MLTT^ eff 
existed, then its preservative extension with true judgements would admit the 
effectiveness condition. Hence, as shown here, we would be able to prove the 
principle of excluded middle for small sets at the level of true judgements and as 
a consequence of the preservation property in the pure type theory itself against 
its claimed constructivity. 
In the framework of set theory the proof reproduced here shows the incom-
patibility, from the intuitionistic point of view, between the extensionality axiom 
and the axiom of choice. In the framework of topos theory it makes use of ex- 
tensional powersets. Here, we will see that to reproduce extensionality in the 
context of intensional type theory, where the axiom of choice holds because of 
the presence of a strong existential quantifier, it is sufficient to have general ef- 
fective quotients to be used on the first two universes and the rule of uniqueness 
of equality proofs at the propositional level. In fact, in the proof we mimic pow- 
ersets by quotienting the first two universes under the relation of equiprovability. 
Then, we need the effectiveness condition to decode the extensional equality re- 
lated to the quotients on the universes into the equiprovability relation. Finally, 
the rule of uniqueness of equality proofs seems crucial to identify the values of 
the choice function applied to two suitable extensionally equal subsets. 

Of course, an analogous proof can be reproduced in the extensional version
of Martin-Löf's set theory with the quotient sets as given in Nuprl [Con86] and
only with the addition of the effectiveness condition.

We know that the effectiveness condition is surely derivable for decidable 
equivalence relations. But in general effectiveness is problematic, because it re- 
stores information that has been forgotten in the introduction rule for the equal- 
ity of equivalence classes. This is confirmed by the proof given here. 

The interest in the effectiveness condition arises from the mathematical prac- 
tice of quotient sets. In order to keep effectiveness for quotient sets in the presence 
of uniqueness of equality proofs, an alternative strategy could be to let quotient 
sets based only on a proof-irrelevant equivalence relation, as it is in the type 
theory of Heyting pretoposes [Mai97] . 



2 The Idea of the Proof: Axiom of Choice versus 
Extensionality 



We describe the idea behind the proof that, in the extension of Martin-Löf's set
theory with effective quotient sets and the uniqueness of equality proofs, the axiom
of choice yields classical logic on small sets. We think that this proof can go
through in any other extension with analogous extensional constructors.
The idea of the proof, originally due to Diaconescu [Dia75], can be clearly understood
in the framework of an intuitionistic set theory with basic axioms such as
the empty set axiom, the pair axiom and the comprehension axiom, even only for
restricted formulas as in CZF [Acz78] (see e.g. [GM78,Bel97]). In this framework
we can see how the axiom of choice is incompatible with the extensionality axiom
from the constructive point of view, as we show in the following.

Let us consider a set $A$ and the following subsets of the set $\{0, 1\}$, where $0 = \emptyset$
and $1 = \{\emptyset\}$:

$$V_0 = \{x \in \{0,1\} : x = 0 \lor \exists y\; y \in A\} \qquad V_1 = \{x \in \{0,1\} : x = 1 \lor \exists y\; y \in A\}$$

Now, if we apply the axiom of choice to the system of sets $\{V_0, V_1\}$ we get that
the following proposition is true:

$$(\forall z \in \{V_0,V_1\}\;\; \exists y \in \{0,1\}\;\; y \in z) \;\to\; \exists f \in \{V_0,V_1\} \to \{0,1\}\;\; \forall z \in \{V_0,V_1\}\;\; f(z) \in z$$

Then we know that the premise of this implication is true by substituting $y$
with $0$ in the case of $V_0$ and with $1$ in the case of $V_1$. Therefore we derive by
modus ponens

$$\exists f \in \{V_0,V_1\} \to \{0,1\}\;\; \forall z \in \{V_0,V_1\}\;\; f(z) \in z$$

Then, applying the elimination of the existential quantifier, we can derive

$$(f(V_0) = 0 \lor \exists y\; y \in A) \land (f(V_1) = 1 \lor \exists y\; y \in A)$$

from which by distributivity we get

$$(f(V_0) = 0 \land f(V_1) = 1) \lor \exists y\; y \in A$$

Now we are going to prove by $\lor$-elimination from the above proposition the
principle of excluded middle for $A$. So, at first we assume $f(V_0) = 0 \land f(V_1) = 1$.
Then note that if we also assume $\exists y\; y \in A$, from this by extensionality we get
that $V_0 = V_1$, which combined with our first assumption yields $0 = 1$, which is
falsum and lets us conclude $\neg\exists y\; y \in A$ and also $\exists y\; y \in A \lor \neg\exists y\; y \in A$. Since
by assuming the second disjunct $\exists y\; y \in A$ we also get $\exists y\; y \in A \lor \neg\exists y\; y \in A$,
by $\lor$-elimination applied on $(f(V_0) = 0 \land f(V_1) = 1) \lor \exists y\; y \in A$ the principle
of excluded middle for any set $A$

$$\exists y\; y \in A \lor \neg\exists y\; y \in A$$

is now derived. We can adapt the outline of this proof to the extension of
Martin-Löf's set theory with effective quotient sets and the uniqueness of equality proofs,
as we will show in the next sections. The uniqueness of equality proofs seems crucial
to reproduce the proof, together with the extensionality captured by effective
quotient sets.
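
The purely propositional part of the argument above (the distributivity step and the final case analysis) can be checked mechanically. The following Lean 4 sketch is only an illustration and not part of the original development: the names `distrib` and `decide_r` are hypothetical, and `p`, `q`, `r` are placeholders standing for $f(V_0) = 0$, $f(V_1) = 1$ and $\exists y\; y \in A$. The set-theoretic content (choice and extensionality) is not formalized; it enters only through the hypotheses.

```lean
-- Distributivity step: (p ∨ r) ∧ (q ∨ r) entails (p ∧ q) ∨ r.
theorem distrib {p q r : Prop} (h : (p ∨ r) ∧ (q ∨ r)) : (p ∧ q) ∨ r :=
  h.1.elim
    (fun hp => h.2.elim (fun hq => Or.inl ⟨hp, hq⟩) (fun hr => Or.inr hr))
    (fun hr => Or.inr hr)

-- Final ∨-elimination. The hypothesis `clash` packages the extensionality
-- step: from r one gets V_0 = V_1, so p ∧ q would yield 0 = 1.
theorem decide_r {p q r : Prop} (h : (p ∧ q) ∨ r)
    (clash : r → p ∧ q → False) : r ∨ ¬ r :=
  h.elim (fun hpq => Or.inr (fun hr => clash hr hpq)) (fun hr => Or.inl hr)
```

Neither proof appeals to a classical axiom, which matches the claim that the non-constructive content comes entirely from combining choice with extensionality.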



3 Extension of iTT with Quotient Sets

In order to investigate the possibility of an extension with effective quotient
sets, firstly we extend the intensional version of Martin-Löf's Intuitionistic Type
Theory [NPS90], here called MLTT, with quotient sets and the rule of uniqueness
of proofs for the intensional propositional equality as in [Hof95] (page 111) and we
call this extension MLTT^ . Then we consider its preservative extension iTT^
with true judgements. Lastly we extend iTT^ with the effectiveness condition
and we call this extension iTT^^ . As said in the introduction the meaning of a
true judgement is the following: $A\ \mathit{true}$ holds if and only if there exists a
proof-element $a$ such that $a \in A$ holds (for an account of this see [Mar84,Val95]). This
is meaningful, since we identify propositions and sets. We call iTT the extension
of MLTT with true judgements. The rules of iTT ( iTT^) about true judgements
are precisely those admissible by the rules of MLTT (MLTT^) according to the
explained semantics, to which we add the following introduction rule

$$\text{(True Introduction)} \qquad \frac{a \in A}{A\ \mathit{true}}$$

such that iTT (iTT^) turns out to be a preservative extension 
of MLTT (MLTT^) in the sense stated in [Val95] and recalled in the intro- 
duction. For instance, among the admissible rules of iTT, we recall the case of 
the set of intensional propositional equality Id. The propositional equality is the 
internalization of the definitional equality between elements of a set at the level of 
propositions, considering two objects definitionally equal if they evaluate to the 
same normal form. Actually, there are two kinds of propositional equality char- 
acterizing intensional and extensional type theories: Id, which is intensional (see 
the rules below), and Eq, which is extensional (see [NPS90] and Section 5).
Intensional propositional equality is entailed by definitional equality, that is two 
objects are propositionally equal if they are definitionally equal, but the other 
way around does not hold. On the contrary, extensional propositional equality 
is equivalent to definitional equality. The main difference is that in the presence 
of intensional propositional equality, definitional equality and type checking are 
decidable, but this is no longer true in the presence of extensional propositional 
equality. 

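The formation, introduction, elimination and conversion rules for the set Id
are the following:

Intensional equality set

$$\text{Id} \quad \frac{A\ \mathit{set} \qquad a \in A \qquad b \in A}{\mathsf{Id}(A, a, b)\ \mathit{set}}$$

$$\text{I-Id} \quad \frac{a \in A}{\mathsf{id}(a) \in \mathsf{Id}(A, a, a)}$$

$$\text{E-Id} \quad \frac{d \in \mathsf{Id}(A, a, b) \qquad c(x) \in C(x, x, \mathsf{id}(x))\ [x : A]}{\mathsf{idpeel}(d, c) \in C(a, b, d)}$$

$$\text{C-Id} \quad \frac{a \in A \qquad c(x) \in C(x, x, \mathsf{id}(x))\ [x : A]}{\mathsf{idpeel}(\mathsf{id}(a), c) = c(a) \in C(a, a, \mathsf{id}(a))}$$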



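In particular, the admissible rules corresponding to the elimination rule are the
following:

$$\frac{d \in \mathsf{Id}(A, a, b) \qquad C(x, x, \mathsf{id}(x))\ \mathit{true}\ [x : A]}{C(a, b, d)\ \mathit{true}} \qquad\qquad \frac{\mathsf{Id}(A, a, b)\ \mathit{true} \qquad C(x, x)\ \mathit{true}\ [x : A]}{C(a, b)\ \mathit{true}}$$

Now, we extend iTT with quotient sets as formulated in [Hof95]^:

Intensional Quotient set

$$\frac{\begin{array}{c} R(x, y)\ \mathit{set}\ [x \in A, y \in A] \\ c_1 \in R(x, x)\ [x \in A], \quad c_2 \in R(y, x)\ [x \in A, y \in A, z \in R(x, y)] \\ c_3 \in R(x, z)\ [x \in A, y \in A, z \in A, w \in R(x, y), w' \in R(y, z)] \end{array}}{A/R\ \mathit{set}}$$

$$\text{I-int. quotient} \quad \frac{a \in A \qquad A/R\ \mathit{set}}{[a] \in A/R}$$

$$\text{eq-int. quotient} \quad \frac{a \in A \qquad b \in A \qquad d \in R(a, b)}{\mathsf{Qax}(d) \in \mathsf{Id}(A/R, [a], [b])}$$

$$\text{E-int. quotient} \quad \frac{\begin{array}{c} s \in A/R \qquad l(x) \in L([x])\ [x \in A] \\ h \in \mathsf{Id}(L([y]), \mathsf{sub}(\mathsf{Qax}(d), l(x)), l(y))\ [x \in A, y \in A, d \in R(x, y)] \end{array}}{\mathsf{Q}(l, h, s) \in L(s)}$$

where the term $\mathsf{sub}(c, d) \equiv \mathsf{idpeel}(c, (x)\lambda y.y)(d)$ for $c \in \mathsf{Id}(A, a, b)$ and $d \in L(a)$
(see also [NPS90] page 64) expresses substitution with equal elements;

$$\text{C-int. quotient} \quad \frac{\begin{array}{c} a \in A \qquad l(x) \in L([x])\ [x \in A] \\ h \in \mathsf{Id}(L([y]), \mathsf{sub}(\mathsf{Qax}(d), l(x)), l(y))\ [x \in A, y \in A, d \in R(x, y)] \end{array}}{\mathsf{Q}(l, h, [a]) = l(a) \in L([a])}$$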

We also want to make quotients effective and we require: 

Effectiveness condition 

$$\frac{a \in A \qquad b \in A \qquad \mathsf{Id}(A/R, [a], [b])\ \mathit{true}}{R(a, b)\ \mathit{true}}$$

Effectiveness expresses the fact that, as usual, every equivalence relation on a 
set A is the kernel of the function which maps an element of A into its equivalence 
class. 

Note that effectiveness is expressed only as a condition in terms of true
judgements, since we are not able to exhibit type-theoretical rules that make
this effectiveness condition admissible, as was possible for the rules of iTT^ on true
judgements, where by a type-theoretical rule we mean a rule expressed using
judgements only of the following four kinds: $A\ \mathit{set}$, $A = B$, $a \in A$, $a = b \in A$.
Indeed, in iTT^^ we will prove a non-constructive principle, namely the principle
of excluded middle on small sets, which lets us conclude that there are
no type-theoretical rules that make the effectiveness condition on quotient sets
admissible and that at the same time follow the Heyting constructive semantics
of connectives.

^ But we restrict the formation rule to quotient sets based on equivalence relations. In
$A/R$ we should record the proof terms $c_1$, $c_2$, $c_3$ and then the corresponding equality
rule should say that, varying $c_1$, $c_2$, $c_3$, the set $A/R$ is the same.

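Finally, we add the rule of uniqueness of propositional equality proofs:

$$\text{Id-Uni I} \quad \frac{a \in A \qquad p \in \mathsf{Id}(A, a, a)}{\mathsf{iduni}(a, p) \in \mathsf{Id}(\mathsf{Id}(A, a, a), p, \mathsf{id}(a))}$$

The corresponding conversion rule is the following:

$$\text{Id-Uni conv} \quad \frac{a \in A}{\mathsf{iduni}(a, \mathsf{id}(a)) = \mathsf{id}(\mathsf{id}(a)) \in \mathsf{Id}(\mathsf{Id}(A, a, a), \mathsf{id}(a), \mathsf{id}(a))}$$

By using Id-Uni and the elimination rule of the propositional equality on the
proposition

$$\Pi_{w \in \mathsf{Id}(A, x, y)}\, \mathsf{Id}(\mathsf{Id}(A, x, y), w, z) \quad [x \in A, y \in A, z \in \mathsf{Id}(A, x, y)]$$

Streicher proved (see [Hof95] page 81) that the set

$$\mathsf{Id}(\mathsf{Id}(A, x, y), w, z) \quad [x \in A, y \in A, z \in \mathsf{Id}(A, x, y), w \in \mathsf{Id}(A, x, y)]$$

is inhabited by the proof-term

$$\mathsf{idpeel}(z, (x)\lambda w' \in \mathsf{Id}(A, x, x).\, \mathsf{iduni}(x, w'))(w)$$

Hence, the uniqueness of proofs of the propositional equality set, called UIP, holds.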

Remark 1. As we said in the introduction, the uniqueness of proofs of the
propositional equality set is definable by pattern-matching [Coq92], but it is not
derivable in the intensional version of Martin-Löf's set theory, as shown by M. Hofmann
and T. Streicher (see [HS95]), who built a model where UIP is not valid.

Finally, we consider the first universe $U_0$, whose elements are called small
sets [NPS90], and the second universe $U_1$, whose elements are called large sets
and where $U_0$ is also coded (see [Mar84] and [Dyb97], but note that we do not give
a new code to terms of the first universe into the second one to make formulas
more readable in the following). We also have to add the following introduction
rules for the codes of the quotient sets into the universes, for $i = 0, 1$:

$$\text{UQ-I} \quad \frac{\begin{array}{c} a \in U_i \qquad r(x, y) \in U_i\ [x \in T_i(a), y \in T_i(a)] \\ c_1 \in T_i(r(x, x))\ [x \in T_i(a)], \quad c_2 \in T_i(r(y, x))\ [x \in T_i(a), y \in T_i(a), z \in T_i(r(x, y))] \\ c_3 \in T_i(r(x, z))\ [x \in T_i(a), y \in T_i(a), z \in T_i(a), w \in T_i(r(x, y)), w' \in T_i(r(y, z))] \end{array}}{a/r \in U_i}$$

with the corresponding conversion rules

$$T_i(a/r) = T_i(a)/(x, y)\,T_i(r(x, y))$$

This extension of iTT, called iTT^^ , is consistent because there is an interpretation
of iTT^^ into classical set theory (ZFC) with two strongly inaccessible
cardinals. Indeed, we interpret the quotient sets in classical quotient sets and
the first two universes respectively in the set of small sets and in the set of large
sets, proved to be actual sets by the presence of the two strongly inaccessible
cardinals.



4 Small Sets Are Classical 

We are going to prove that for small sets in iTT^^ the principle of excluded
middle holds, i.e. for any element $a$ of the first universe $U_0$, the judgement
$T_0(a) \lor \neg T_0(a)\ \mathit{true}$ holds. This is a consequence of a particular application of
the axiom of choice (AC). In topos theory the fact that AC implies the principle
of excluded middle was first proved by Diaconescu [Dia75]. The same result
is obtained in [MV99] within an extension of iTT with a powerset constructor
by adapting the logical proof of [Bel88] about Diaconescu's theorem. Also Hofmann
in [Hof95] claimed that the same result can be obtained in the Calculus of
Constructions by adding proof-irrelevance at the level of propositions, equiprovability
as equality between propositions and extensionality as equality between
dependent propositions.

Here, we show that we can recover this proof in a predicative setting with 
effective quotient sets instead of an impredicative one like a topos. The key 
point is to simulate the powerset, by quotienting the first two universes under 
the relation of equiprovability among their elements. 

Also in iTT^^, the so-called intuitionistic axiom of choice

$$((\forall x \in A)(\exists y \in B)\; C(x, y)) \to ((\exists f \in A \to B)(\forall x \in A)\; C(x, f(x)))\ \mathit{true}$$

is proved by disjoint union sets, exactly as in [Mar84], concluding by true
introduction.

We are going to use the axiom of choice on the quotients made out of the
first two universes under the equivalence relation of equiprovability, i.e.

$$T_0(x) \leftrightarrow T_0(y)\ \mathit{set}\ [x \in U_0, y \in U_0] \qquad T_1(x) \leftrightarrow T_1(y)\ \mathit{set}\ [x \in U_1, y \in U_1]$$

Let us put the following abbreviations for $i = 0, 1$

$$\Omega_i \equiv U_i / (x, y)\, T_i(x) \leftrightarrow T_i(y)$$

Since there is a code for $U_0$ in $U_1$, i.e. $U_0 \in U_1$, then there is inside $U_1$ the
code $\widehat{\Omega}_0$ for $\Omega_0$ such that

$$T_1(\widehat{\Omega}_0) = \Omega_0$$

The reason to use the two universes is the possibility of deriving

$$\widehat{\mathsf{Id}}(\Omega_0, z, [\widehat{\top}]) \in U_1 \quad [z \in \Omega_0]$$

where $\top$ is the singleton set (see [NPS90]). We use the abbreviation $a =_A b$ for
$\mathsf{Id}(A, a, b)$, when it is not coded in a universe.

Moreover, if $A$ is a set, we will often write $A$ to mean the judgement $A\ \mathit{true}$.
We also recall (see [NPS90]) that, in the presence of $U_0$, we can derive

$$\neg(\mathsf{true} =_{\mathsf{Bool}} \mathsf{false})$$

Now, we go on to show the claimed proof of the principle of excluded middle on
small sets. As in [MV99], one of the key points is to internalize the truth of sets
within the quotients on the universes, simulating the powersets. This is expressed
by the following lemma, which is provable by the introduction equality rule on
the quotient set in terms of true judgements and by the effectiveness condition.

Lemma 1. For $i = 0, 1$ and any set $a \in U_i$, $[a] =_{\Omega_i} [\widehat{\top}]$ iff $T_i(a)\ \mathit{true}$.

Proof. From $[a] =_{\Omega_i} [\widehat{\top}]\ \mathit{true}$ by effectiveness of quotient sets we get
$T_i(a) \leftrightarrow T_i(\widehat{\top})\ \mathit{true}$, but $T_i(\widehat{\top}) = \top$ so $T_i(a)\ \mathit{true}$. On the other hand,
from $T_i(a)\ \mathit{true}$, we get $T_i(a) \leftrightarrow T_i(\widehat{\top})$ and by the true version of the equality
rule on the quotient set we conclude $[a] =_{\Omega_i} [\widehat{\top}]$.

■

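Now, we consider the following abbreviation: for $z \in \Omega_0$

$$E(z) \equiv \mathsf{Id}(\Omega_0, z, [\widehat{\top}])$$

Hence, we prove:

Proposition 1. In iTT^^ the following proposition

$$(\forall z \in \Sigma_{w \in \Omega_0 \times \Omega_0}\, [\widehat{E}(\pi_1(w)) \mathbin{\widehat{\lor}} \widehat{E}(\pi_2(w))] =_{\Omega_1} [\widehat{\top}])$$
$$(\exists x \in \mathsf{Bool})\;\; (x =_{\mathsf{Bool}} \mathsf{true} \to E(\pi_1(\pi_1(z)))) \land (x =_{\mathsf{Bool}} \mathsf{false} \to E(\pi_2(\pi_1(z))))$$

is true.

Proof. Suppose $z \in \Sigma_{w \in \Omega_0 \times \Omega_0}\, [\widehat{E}(\pi_1(w)) \mathbin{\widehat{\lor}} \widehat{E}(\pi_2(w))] =_{\Omega_1} [\widehat{\top}]$. Then $\pi_1(z) \in \Omega_0 \times \Omega_0$
and $\pi_2(z)$ is a proof of $[\widehat{E}(\pi_1(\pi_1(z))) \mathbin{\widehat{\lor}} \widehat{E}(\pi_2(\pi_1(z)))] =_{\Omega_1} [\widehat{\top}]$. Thus,
by lemma 1 and by the conversion rules for $U_1$, $E(\pi_1(\pi_1(z))) \lor E(\pi_2(\pi_1(z)))$.
The result can now be proved by $\lor$-elimination, by putting for example $x = \mathsf{true}$
in the case $E(\pi_1(\pi_1(z)))\ \mathit{true}$.

■

Thus, we can use the intuitionistic axiom of choice to obtain:

Proposition 2. In iTT^^ the following proposition

$$(\exists f \in \Sigma_{w \in \Omega_0 \times \Omega_0}\, [\widehat{E}(\pi_1(w)) \mathbin{\widehat{\lor}} \widehat{E}(\pi_2(w))] =_{\Omega_1} [\widehat{\top}] \to \mathsf{Bool})$$
$$(\forall z \in \Sigma_{w \in \Omega_0 \times \Omega_0}\, [\widehat{E}(\pi_1(w)) \mathbin{\widehat{\lor}} \widehat{E}(\pi_2(w))] =_{\Omega_1} [\widehat{\top}])$$
$$(f(z) =_{\mathsf{Bool}} \mathsf{true} \to E(\pi_1(\pi_1(z)))) \land (f(z) =_{\mathsf{Bool}} \mathsf{false} \to E(\pi_2(\pi_1(z))))$$

is true.

Suppose, now, that $a \in U_0$ is the code of a small set; then

$$(([a], [\widehat{\top}]),\ \mathsf{Qax}((\lambda y.{*},\ \lambda y'.\mathsf{inr}(\mathsf{id}([\widehat{\top}])))))$$

is an element of the set

$$\Sigma_{w \in \Omega_0 \times \Omega_0}\, [\widehat{E}(\pi_1(w)) \mathbin{\widehat{\lor}} \widehat{E}(\pi_2(w))] =_{\Omega_1} [\widehat{\top}]$$

where $* \in \top$ is the only element of the singleton set. In fact, $([a], [\widehat{\top}]) \in \Omega_0 \times \Omega_0$
and

$$(\lambda y.{*},\ \lambda y'.\mathsf{inr}(\mathsf{id}([\widehat{\top}]))) \in \mathsf{Id}(\Omega_0, [a], [\widehat{\top}]) \lor \mathsf{Id}(\Omega_0, [\widehat{\top}], [\widehat{\top}]) \leftrightarrow \top$$

from which, since

$$\mathsf{Id}(\Omega_0, [a], [\widehat{\top}]) \lor \mathsf{Id}(\Omega_0, [\widehat{\top}], [\widehat{\top}]) \leftrightarrow \top \;=\; T_1(\widehat{E}([a]) \mathbin{\widehat{\lor}} \widehat{E}([\widehat{\top}])) \leftrightarrow T_1(\widehat{\top})$$

by the equality rule on the quotient set we get

$$\mathsf{Qax}((\lambda y.{*},\ \lambda y'.\mathsf{inr}(\mathsf{id}([\widehat{\top}])))) \in [\widehat{E}([a]) \mathbin{\widehat{\lor}} \widehat{E}([\widehat{\top}])] =_{\Omega_1} [\widehat{\top}]$$

Analogously,

$$(([\widehat{\top}], [a]),\ \mathsf{Qax}((\lambda y.{*},\ \lambda y'.\mathsf{inl}(\mathsf{id}([\widehat{\top}])))))$$

is an element of the set

$$\Sigma_{w \in \Omega_0 \times \Omega_0}\, [\widehat{E}(\pi_1(w)) \mathbin{\widehat{\lor}} \widehat{E}(\pi_2(w))] =_{\Omega_1} [\widehat{\top}]$$

Let us put for $w \in \Omega_0$

$$q_1(w) \equiv ((w, [\widehat{\top}]),\ \mathsf{Qax}((\lambda y.{*},\ \lambda y'.\mathsf{inr}(\mathsf{id}([\widehat{\top}])))))$$

and

$$q_2(w) \equiv (([\widehat{\top}], w),\ \mathsf{Qax}((\lambda y.{*},\ \lambda y'.\mathsf{inl}(\mathsf{id}([\widehat{\top}])))))$$

Now, let $f$ be the choice function obtained by the $\exists$-elimination rule on the judgement
in proposition 2; then $f(q_1([a])) =_{\mathsf{Bool}} \mathsf{true} \to E([a])$. But

$$(f(q_1([a])) =_{\mathsf{Bool}} \mathsf{true}) \lor (f(q_1([a])) =_{\mathsf{Bool}} \mathsf{false})$$

since the set Bool is decidable (for a proof see [NPS90], page 177), and hence,
by $\lor$-elimination, lemma 1 and a little intuitionistic logic, one gets that

$$(1) \quad T_0(a) \lor (f(q_1([a])) =_{\mathsf{Bool}} \mathsf{false})$$

and in an analogous way

$$(2) \quad T_0(a) \lor (f(q_2([a])) =_{\mathsf{Bool}} \mathsf{true})$$

Thus, by using distributivity on the conjunction of (1) and (2), one finally obtains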






Maria Emilia Maietti 



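Proposition 3. For any small set $a \in U_0$ in iTT^^ the following proposition

$$(\exists f \in \Sigma_{w \in \Omega_0 \times \Omega_0}\, [\widehat{E}(\pi_1(w)) \mathbin{\widehat{\lor}} \widehat{E}(\pi_2(w))] =_{\Omega_1} [\widehat{\top}] \to \mathsf{Bool})$$
$$T_0(a) \lor ((f(q_1([a])) =_{\mathsf{Bool}} \mathsf{false}) \land (f(q_2([a])) =_{\mathsf{Bool}} \mathsf{true}))$$

is true.

Now, we proceed by $\exists$-elimination assuming for some proof-term $f$

$$T_0(a) \lor ((f(q_1([a])) =_{\mathsf{Bool}} \mathsf{false}) \land (f(q_2([a])) =_{\mathsf{Bool}} \mathsf{true}))$$

on which we are going to apply $\lor$-elimination to prove the principle of excluded
middle for $T_0(a)$.

But, first of all, note that if we assume $T_0(a)\ \mathit{true}$ then $[a] =_{\Omega_0} [\widehat{\top}]\ \mathit{true}$ by
lemma 1 and hence

$$q_1([a]) =_{\Sigma(\Omega_0 \times \Omega_0, \ldots)} q_1([\widehat{\top}])$$

by the elimination rule of the intensional propositional equality with respect to
the proposition

$$q_1(x) =_{\Sigma(\Omega_0 \times \Omega_0, \ldots)} q_1(y) \quad [x \in \Omega_0, y \in \Omega_0]$$

Thus, $f(q_1([a])) =_{\mathsf{Bool}} f(q_1([\widehat{\top}]))$ and in a similar way from the same assumption
we can also prove

$$f(q_2([a])) =_{\mathsf{Bool}} f(q_2([\widehat{\top}]))$$

Hence, since by the uniqueness of propositional equality proofs UIP we get a
proof-term of

$$\pi_2(q_1([\widehat{\top}])) =_{[\widehat{E}([\widehat{\top}]) \mathbin{\widehat{\lor}} \widehat{E}([\widehat{\top}])] =_{\Omega_1} [\widehat{\top}]} \pi_2(q_2([\widehat{\top}]))$$

as $\pi_1(q_1([\widehat{\top}])) = ([\widehat{\top}], [\widehat{\top}]) = \pi_1(q_2([\widehat{\top}]))$, we conclude by the elimination rule of
the propositional equality that

$$q_1([\widehat{\top}]) =_{\Sigma(\Omega_0 \times \Omega_0, \ldots)} q_2([\widehat{\top}])$$

and therefore by transitivity

$$f(q_1([a])) =_{\mathsf{Bool}} f(q_2([a]))$$

Then if we also assume

$$(f(q_1([a])) =_{\mathsf{Bool}} \mathsf{false}) \land (f(q_2([a])) =_{\mathsf{Bool}} \mathsf{true})\ \mathit{true}$$

we conclude $\mathsf{true} =_{\mathsf{Bool}} \mathsf{false}\ \mathit{true}$. But we know that we can derive an element of
$\neg(\mathsf{true} =_{\mathsf{Bool}} \mathsf{false})$. Hence, under the assumption

$$(f(q_1([a])) =_{\mathsf{Bool}} \mathsf{false}) \land (f(q_2([a])) =_{\mathsf{Bool}} \mathsf{true}),$$

the judgement $\neg T_0(a)\ \mathit{true}$ holds. So, from proposition 3, by $\exists$-elimination and
by $\lor$-elimination, applying $\lor$-introduction when the first disjunct is assumed and
using the above argument when the latter disjunct is assumed, we can conclude
$(T_0(a) \lor \neg T_0(a))\ \mathit{true}$ and

$$\forall_{a \in U_0}\; T_0(a) \lor \neg T_0(a)\ \mathit{true}$$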

To sum up, the key points to reproduce the proof of the principle of excluded
middle on small sets are the following:

— we use the axiom of choice, by quantifying on

$$[\widehat{E}(\pi_1(w)) \mathbin{\widehat{\lor}} \widehat{E}(\pi_2(w))] =_{\Omega_1} [\widehat{\top}]$$

instead of $E(\pi_1(w)) \lor E(\pi_2(w))$ in order to forget the proof-term
of the disjunction and hence we need the second universe to encode

$$E(z) \equiv \mathsf{Id}(\Omega_0, z, [\widehat{\top}]) \quad [z \in \Omega_0]$$

and to express at the propositional level when it is true;

— we exhibit a proof-term $q_1$ by means of the equality rule on the quotient set
such that for $a \in U_0$

$$q_1([a]) \in \Sigma_{w \in \Omega_0 \times \Omega_0}\, [\widehat{E}(\pi_1(w)) \mathbin{\widehat{\lor}} \widehat{E}(\pi_2(w))] =_{\Omega_1} [\widehat{\top}]$$

in order to prove under the assumption $[a] =_{\Omega_0} [\widehat{\top}]\ \mathit{true}$

$$q_1([a]) = q_1([\widehat{\top}])\ \mathit{true} \quad \text{and} \quad q_2([a]) = q_2([\widehat{\top}])\ \mathit{true}$$

— we use the uniqueness of propositional equality proofs in order to prove

$$q_1([\widehat{\top}]) = q_2([\widehat{\top}])$$

In conclusion, if we had type-theoretical rules that make all the rules of iTT^^
admissible such that we can prove that $C\ \mathit{true}$ holds in iTT^^ if and only if
there exists a proof element for the proposition $C$, then we would have a proof
element for the proposition $\Pi_{a \in U_0}\, T_0(a) \lor \neg T_0(a)$, which is expected to fail for
small sets, according to an intuitionistic explanation of connectives.

5 Extensional Quotient Sets in Extensional Set Theory 

The proof that effectiveness of quotient sets yields classical logic for small sets
can also be done within the extensional version of Martin-Löf's Intuitionistic
Type Theory with true judgements, called eTT, extended with the rules for
quotient sets, as in Nuprl [Con86], to which we add the effectiveness condition
and the introduction and conversion rules of the codes for quotient sets into the
first two universes.
About the rules of true judgements, we only recall the case of the set of the
extensional propositional equality Eq (see [NPS90]). The formation, introduction,
elimination and conversion rules are the following:

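Extensional Equality set

$$\text{Eq)} \quad \frac{C\ \mathit{set} \qquad c \in C \qquad d \in C}{\mathsf{Eq}(C, c, d)\ \mathit{set}}$$

$$\text{I-Eq)} \quad \frac{c \in C}{\mathsf{eq}_C(c) \in \mathsf{Eq}(C, c, c)}$$

$$\text{E-Eq)} \quad \frac{p \in \mathsf{Eq}(C, c, d)}{c = d \in C}$$

$$\text{C-Eq)} \quad \frac{p \in \mathsf{Eq}(C, c, d)}{p = \mathsf{eq}_C(c) \in \mathsf{Eq}(C, c, d)}$$

In particular the elimination rule yields the admissibility of the following rule
on true judgements:

$$\frac{\mathsf{Eq}(A, a, b)\ \mathit{true}}{a = b \in A}$$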

We extend eTT with the rules of extensional quotient sets: 

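Quotient set

$$\frac{\begin{array}{c} R(x, y)\ \mathit{set}\ [x \in A, y \in A] \\ c_1 \in R(x, x)\ [x \in A], \quad c_2 \in R(y, x)\ [x \in A, y \in A, z \in R(x, y)] \\ c_3 \in R(x, z)\ [x \in A, y \in A, z \in A, w \in R(x, y), w' \in R(y, z)] \end{array}}{A/R\ \mathit{set}}$$

$$\text{I-quotient} \quad \frac{a \in A \qquad A/R\ \mathit{set}}{[a] \in A/R}$$

$$\text{eq-quotient} \quad \frac{a \in A \qquad b \in A \qquad d \in R(a, b)}{[a] = [b] \in A/R}$$

$$\text{E-quotient} \quad \frac{s \in A/R \qquad l(x) \in L([x])\ [x \in A] \qquad l(x) = l(y) \in L([x])\ [x \in A, y \in A, d \in R(x, y)]}{\mathsf{Q}(l, s) \in L(s)}$$

$$\text{C-quotient} \quad \frac{a \in A \qquad l(x) \in L([x])\ [x \in A] \qquad l(x) = l(y) \in L([x])\ [x \in A, y \in A, d \in R(x, y)]}{\mathsf{Q}(l, [a]) = l(a) \in L([a])}$$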

Then we make extensional quotients effective through the following condition in 
terms of true judgements: 

Effectiveness condition

$$\frac{a \in A \qquad b \in A \qquad [a] = [b] \in A/R}{R(a, b)\ \mathit{true}}$$

We also add the codes of quotient sets in the introduction rules of the first two
universes and their corresponding conversion rules, as in Section 3. Note that,
as for the intensional propositional equality set, the introduction of equality on
quotient sets yields the admissibility of the following rule:

$$\frac{a \in A \qquad b \in A \qquad R(a, b)\ \mathit{true}}{[a] = [b] \in A/R}$$

This extension of eTT, called eTT^’^ , is consistent, because there exists an
interpretation in classical set theory (ZFC) with two strongly inaccessible cardinals.
In the presence of the extensional propositional equality set, the rules for
intensional quotient sets become equivalent to those of extensional quotient sets and
the same holds with respect to the effectiveness condition. So, we can reproduce
in eTT^^ the proof of the previous section to derive

$$(\Pi_{a \in U_0}\, T_0(a) \lor \neg T_0(a))\ \mathit{true}$$

which is expected to fail for small sets.

Note that this proof cannot be recovered in the extensional version of set
theory with effective quotient sets restricted to mono equivalence relations, that
is, equivalence relations inhabited by at most one proof. This kind of quotient is
used in the extensional type theory of Heyting pretoposes [Mai97] and also
of toposes [Mai98], where even effectiveness can be type-theoretically expressed.

I would like to thank Silvio Valentini, Peter Aczel and Giovanni Sambin for
helpful discussions that stimulated the investigation on this topic, Peter Dybjer
for his comments on a preliminary version of this paper and lastly the referees
for their valuable suggestions.

1 Introduction

Notational definitions are pervasive in mathematical practice and are therefore 
supported in most automated theorem proving systems such as Coq [B+98], 
PVS [ORS92], Lego [LP92], or Isabelle [Pau94]. Semantically, notational defi- 
nitions are transparent, that is, one obtains the meaning of an expression by 
interpreting the result of expanding all definitions. Pragmatically, however, ex- 
panding all definitions as they are encountered is unsatisfactory, since it can be 
computationally expensive and complicate the user interface. 

In this paper we investigate the interaction of notational definitions with
algorithms for testing equality and unification. We propose a syntactic criterion
on definitions which avoids their expansion in many cases without losing
soundness or completeness with respect to $\beta\delta$-conversion. Our setting is the
dependently typed $\lambda$-calculus [HHP93], but, with minor modifications, our results
should apply to richer type theories and logics.

* This work was supported by NSF Grant CCR-9619584

The question of when definitions need to be expanded is surprisingly subtle and
of great practical importance. Most algorithms for equality and unification rely
on decomposing a problem

$$c\,M_1 \ldots M_n = c\,N_1 \ldots N_n$$

into

$$M_1 = N_1, \ldots, M_n = N_n.$$

However, if $c$ is defined this is not necessarily complete. For example, if $k = \lambda x.\, c'$
then

$$k\,M = k\,N$$

for every $M$ and $N$. Always expanding definitions is computationally expensive,
especially when they duplicate their arguments. Expanding them only when the
equality between the arguments fails often performs much redundant computation
and, moreover, is incomplete in the presence of meta-variables. For example,
with the same definition for $k$,

$$k\,X = k\,c'$$

would succeed without expanding $k$ with the substitution $X = c'$, even though
the most general unifier leaves $X$ uninstantiated.

We identify a class of definitions (called strict) for which decomposition
is complete. It also solves a related problem with the completeness of the
so-called occurs-check during unification by generalizing Huet's rigid path
criterion [Hue75]. Fortunately, most notational definitions are strict in the sense we
define. We do not deal with recursive definitions, for example, which require
different considerations and have been treated in the literature on functional logic
programming [Han94]. Other aspects of notational definitions in mathematical
practice have been studied by Griffin [Gri88].

We have implemented a strictness checker and unification algorithm in
Twelf [PS98], an implementation of the logical framework LF which supports
type reconstruction, logic programming, and theorem proving. It has been applied
to a variety of examples from the area of logics and programming languages.
The Twelf system is freely available from the Twelf homepage
http://www.cs.cmu.edu/~twelf.

This paper is organized as follows. In Section 2 we describe a spine formula- 
tion of LF with definitions, and in Section 3 a small logic as running example. 
In Section 4 we describe the strictness criterion and show its correctness. We 
generalize our results from conversion to unification in Section 5 and conclude 
and describe future work in Section 6. 



2 Language 

The type theory underlying the logical framework LF [HHP93] is divided into
three levels: objects, types, and kinds. We deviate from standard formulations
by adopting a spine notation for application [CP97] and by adding definitions.
In spine notation, we write $c \cdot M_1; \ldots; M_n; \mathsf{nil}$ for a term $c\,M_1 \ldots M_n$ to make its
head explicit. It contributes significantly to the concise presentation of the theory
in Section 4 and corresponds closely to the implementation in Twelf. We use $a$
for constant type families, $x$ for object-level variables, $c$ for constructors
(that is, declared constants without a definition) and $d$ for defined constants.
For simplicity, we only allow definitions at the level of objects, but the results
also apply to definitions at the level of types.

Kinds:      $K ::= \mathsf{type} \mid \Pi x{:}A.\, K$
Types:      $A ::= a \cdot S \mid \Pi x{:}A_1.\, A_2$
Objects:    $M ::= H \cdot S \mid \lambda x{:}A.\, M \mid M \cdot S$
Heads:      $H ::= x \mid c \mid d$
Spines:     $S ::= \mathsf{nil} \mid M; S$
Signature:  $\Sigma ::= \cdot \mid \Sigma, a : K \mid \Sigma, c : A \mid \Sigma, d : A = M$
Contexts:   $\Gamma ::= \cdot \mid \Gamma, x : A$

$a \cdot S$ and $H \cdot S$ are our notation for the application of a variable or constant to
arguments given as a spine. Such terms are in weak head-normal form unless the
constant at the head is defined. For the sake of readability, we omit the trailing
$\mathsf{nil}$ from spines, and if the spine is empty, we also omit the "$\cdot$". $\Pi x{:}A_1.\, A_2$ is a
function type, which we may write as $A_1 \to A_2$ if $x$ does not occur free in $A_2$.
In the examples we sometimes omit types and write definitions as $d = M$.

As in [CP97] we assume throughout that all objects are in $\eta$-long form. Note
that $\eta$-long forms are preserved under $\beta\delta$-conversion. Working only with $\eta$-long
forms simplifies the presentation of the formal judgments and proofs, but is not
essential. Our results still hold if we drop this assumption, both with and without
$\eta$-conversion. The notion of definitional equality is then based on $\beta\delta$-conversion
where $\delta$-reduction expands definitions.

$$M \cdot \mathsf{nil} \longrightarrow M$$
$$(\lambda x{:}A.\, M) \cdot (N; S) \longrightarrow_\beta ([N/x]M) \cdot S$$
$$d \cdot S \longrightarrow_\delta M \cdot S \quad \text{where } d : A = M \in \Sigma$$

A $\beta$-redex has the form $M \cdot S$, a $\delta$-redex the form $d \cdot S$.
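
The three reduction rules can be read as a weak head reduction procedure on spine-form terms. The Python sketch below is a hedged illustration rather than the Twelf implementation: the tuple encoding, the signature `SIGMA` with its defined constant `d`, and the function names are all assumptions made for the example, type labels are dropped, and substituted arguments are assumed closed so that no renaming is needed.

```python
# Spine-form terms (encoding is an assumption of this sketch):
#   ("root", (kind, name), spine)  H . S with kind in {"var", "con", "def"}
#   ("lam", x, body)               lambda-abstraction, type label omitted
#   ("redex", term, spine)         M . S
# A spine is a Python tuple of terms; the empty tuple plays the role of nil.

# Hypothetical signature with one definition: d = \x. \y. c . (x; nil)
SIGMA = {"d": ("lam", "x", ("lam", "y",
               ("root", ("con", "c"), (("root", ("var", "x"), ()),))))}

def subst(term, x, n):
    """[n/x]term, assuming n is closed so no variable capture can occur."""
    if term[0] == "lam":
        return term if term[1] == x else ("lam", term[1], subst(term[2], x, n))
    spine = tuple(subst(arg, x, n) for arg in term[2])
    if term[0] == "redex":
        return ("redex", subst(term[1], x, n), spine)
    if term[1] == ("var", x):
        # [n/x](x . S) = n . S, simplified to n when S is nil.
        return ("redex", n, spine) if spine else n
    return ("root", term[1], spine)

def whnf(term):
    """Apply the three rules at the head until none of them applies."""
    while True:
        if term[0] == "root" and term[1][0] == "def":
            term = ("redex", SIGMA[term[1][1]], term[2])      # delta
        elif term[0] == "redex":
            m, spine = term[1], term[2]
            if not spine:
                term = m                                      # M . nil --> M
            elif m[0] == "lam":
                term = ("redex", subst(m[2], m[1], spine[0]), spine[1:])  # beta
            else:
                m = whnf(m)
                if m[0] != "lam":
                    raise ValueError("non-function applied to arguments")
                term = ("redex", m, spine)
        else:
            return term

a = ("root", ("con", "a"), ())
b = ("root", ("con", "b"), ())
result = whnf(("root", ("def", "d"), (a, b)))  # d . (a; b)
print(result)
```

Only the head is reduced: arguments inside the resulting spine are left untouched, which is what distinguishes weak head reduction from full normalization.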

We assume that constants and variables are declared at most once in a sig- 
nature and context, respectively. As usual we apply tacit renaming of bound 
variables to maintain this assumption and to guarantee capture-avoiding substi- 
tution. 

The LF type theory is defined by a number of mutually dependent judgments
which define valid objects, types, kinds, contexts, and signatures, and, in our
case, also heads and spines. We will not reiterate the rules here
(see [HHP93,CP97]). The main typing judgments are of the form $\Gamma \vdash_\Sigma M : A$,
expressing that object $M$ has type $A$ in context $\Gamma$, and $\Gamma \vdash_\Sigma S : A > A'$,
expressing that the spine $S$ acts as a vector of well-typed arguments to a head
of type $A$ returning a result of type $A'$. A definition $d : A = M$ is well-formed
in a signature $\Sigma$ if $\cdot \vdash_\Sigma M : A$.

We generally assume that the signature $\Sigma$ is valid and fixed and therefore omit
it from the typing and other related judgments introduced below. We take
$\beta\delta$-conversion as our notion of definitional equality, which guarantees that every
well-typed object has an equivalent normal form. Since we also assume that
every object is in $\eta$-long form these normal forms are long $\eta\beta\delta$-normal forms. We
write $M \xrightarrow{whr} M'$ for weak head reduction which applies local $\beta$- or $\delta$-reductions.

We write $\Gamma \vdash M_1 \equiv M_2$ to express that two well-typed objects $M_1$ and $M_2$
are equivalent modulo $\beta\delta$-conversion. Similarly, for spines, we write $\Gamma \vdash S_1 \equiv S_2$.

Since all validity judgments are decidable with well-understood algorithms, 
we tacitly assume that all objects, types, kinds, spines, heads, contexts, and 
signatures are valid and, for equalities, that both sides have the same type or 
kind. 

Our proofs exploit the following standard properties of definitional equality
based on $\beta\delta$-conversion.

Property 1 (Equivalence).

1. $\Gamma \vdash M \equiv M$.

2. For all $H_1$, $H_2$ of the form $x$ or $c$,
   $\Gamma \vdash H_1 \cdot S_1 \equiv H_2 \cdot S_2$ iff $H_1 = H_2$ and $\Gamma \vdash S_1 \equiv S_2$

3. $\Gamma \vdash a_1 \cdot S_1 \equiv a_2 \cdot S_2$ iff $a_1 = a_2$ and $\Gamma \vdash S_1 \equiv S_2$

4. $\Gamma \vdash \lambda y{:}A_1.\, M_1 \equiv \lambda y{:}A_2.\, M_2$ iff $\Gamma \vdash A_1 \equiv A_2$ and $\Gamma, y : A_1 \vdash M_1 \equiv M_2$

5. $\Gamma \vdash \Pi y{:}A_1.\, B_1 \equiv \Pi y{:}A_2.\, B_2$ iff $\Gamma \vdash A_1 \equiv A_2$ and $\Gamma, y : A_1 \vdash B_1 \equiv B_2$

6. For all $M_1$, $M_2$ in which $y$ does not occur free,
   $\Gamma, y : A \vdash M_1 \cdot y \equiv M_2 \cdot y$ iff $\Gamma \vdash M_1 \equiv M_2$

7. $\Gamma \vdash M_1; S_1 \equiv M_2; S_2$ iff $\Gamma \vdash M_1 \equiv M_2$ and $\Gamma \vdash S_1 \equiv S_2$

8. If $M_1 \xrightarrow{whr} M_1'$ and $M_2 \xrightarrow{whr} M_2'$ then $\Gamma \vdash M_1 \equiv M_2$ iff $\Gamma \vdash M_1' \equiv M_2'$

For a well-typed definition $d : A = M$ the head-normal form of $M$ must always
exist and have the shape $M = \lambda x_1{:}A_1. \ldots \lambda x_n{:}A_n.\, H \cdot S$. We call $x_1, \ldots, x_n$
argument parameters, and all other parameters in the body $H \cdot S$ local parameters.
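
The distinction between argument parameters and local parameters can be illustrated by splitting the body of a definition. The Python sketch below assumes a hypothetical tuple encoding of spine-form terms with type labels omitted and with distinct bound-variable names; the helper `unused_arguments` is not the strictness criterion of Section 4, only a simple check that flags definitions which ignore an argument, such as $k = \lambda x.\, c'$ from the introduction.

```python
# Spine-form terms (encoding is an assumption of this sketch):
#   ("lam", x, body)               lambda-abstraction, type label omitted
#   ("root", (kind, name), spine)  H . S with kind in {"var", "con", "def"}

def split_definition(term):
    """Peel the leading lambdas: (argument parameters, body H . S)."""
    params = []
    while term[0] == "lam":
        params.append(term[1])
        term = term[2]
    return params, term

def local_parameters(term):
    """Names bound by lambdas occurring inside a body."""
    if term[0] == "lam":
        return {term[1]} | local_parameters(term[2])
    names = set()
    for arg in term[2]:
        names |= local_parameters(arg)
    return names

def variable_heads(term):
    """Variables in head position anywhere in the term; in eta-long
    form every variable occurrence is the head of some H . S."""
    if term[0] == "lam":
        return variable_heads(term[2])
    (kind, name), spine = term[1], term[2]
    names = {name} if kind == "var" else set()
    for arg in spine:
        names |= variable_heads(arg)
    return names

def unused_arguments(definition):
    """Argument parameters that never occur in the body."""
    params, body = split_definition(definition)
    used = variable_heads(body)
    return [x for x in params if x not in used]

# k = \x. c'
K = ("lam", "x", ("root", ("con", "c'"), ()))
# Hypothetical definition: \f. c . (\y. f . (y))
Y = ("root", ("var", "y"), ())
INNER = ("lam", "y", ("root", ("var", "f"), (Y,)))
D = ("lam", "f", ("root", ("con", "c"), (INNER,)))

print(split_definition(K)[0], unused_arguments(K))
print(split_definition(D)[0], local_parameters(split_definition(D)[1]))
```

In the second definition `f` is the only argument parameter and `y` is a local parameter, while the first definition discards its argument entirely.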

3 Example 

To illustrate our algorithms we use the encoding of a small fragment of
propositional intuitionistic logic in LF [HHP93].

Formulas: $F ::= \top \mid \bot \mid F_1 \supset F_2$

Formulas are represented as a type and each connective as a constant.

                                                      o : type
$\ulcorner \top \urcorner = \mathsf{true}$                                   true : o
$\ulcorner \bot \urcorner = \mathsf{false}$                                  false : o
$\ulcorner F_1 \supset F_2 \urcorner = \mathsf{imp} \cdot (\ulcorner F_1 \urcorner; \ulcorner F_2 \urcorner)$          imp : o $\to$ o $\to$ o

This simple logic can now be extended by negation in the usual way, by
defining $\neg F = F \supset \bot$, which leads to a definition of the constant not in terms
of the other constants.

not : o $\to$ o = $\lambda F{:}$o. imp $\cdot$ ($F$; false)
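
The encoding function and the expansion of the defined constant `not` can be sketched as follows. This Python fragment is an illustration under assumptions of its own: the formula syntax (`"top"`, `"bot"`, `("imp", F, G)`, `("not", F)`), the `(head, spine)` pairs, and the function names are hypothetical and chosen only to mirror the table above.

```python
# Formulas: "top" | "bot" | ("imp", F, G) | ("not", F)
# Encoded objects are (head, spine) pairs; spine is a list of objects.

def encode(formula):
    """Map a formula to its LF representation of type o."""
    if formula == "top":
        return ("true", [])
    if formula == "bot":
        return ("false", [])
    op, *args = formula
    if op == "imp" and len(args) == 2:
        return ("imp", [encode(args[0]), encode(args[1])])
    if op == "not" and len(args) == 1:
        return ("not", [encode(args[0])])
    raise ValueError(f"unknown formula: {formula!r}")

def expand_not(obj):
    """Delta-expand every `not . F` into `imp . (F; false)`."""
    head, spine = obj
    spine = [expand_not(arg) for arg in spine]
    if head == "not":
        return ("imp", [spine[0], ("false", [])])
    return (head, spine)

def show(obj):
    """Print an object in spine notation, omitting empty spines."""
    head, spine = obj
    if not spine:
        return head
    return f"{head} · ({'; '.join(show(arg) for arg in spine)})"

negation = encode(("not", ("imp", "top", "bot")))
print(show(negation))
print(show(expand_not(negation)))
```

The first line printed keeps the definition folded, the second shows the same formula with `not` expanded, which is the form that a checker would otherwise have to compute on every comparison.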



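We write $\vdash F$ to express that the formula $F$ has a natural deduction, using the
following four rules:

$$\frac{}{\vdash \top}\ \top I \qquad \frac{\vdash \bot}{\vdash F}\ \bot E \qquad \frac{\begin{array}{c} \overline{\vdash F}^{\,u} \\ \vdots \\ \vdash G \end{array}}{\vdash F \supset G}\ {\supset}I^u \qquad \frac{\vdash F \supset G \qquad \vdash F}{\vdash G}\ {\supset}E$$

As shown in [HHP93], there is an adequate encoding of this calculus in LF.
The judgment $\vdash A$ is represented as a dependent type family, and the four rules
as object constants.

nd : o $\to$ type
truei : nd $\cdot$ true
falsee : $\Pi F{:}$o. nd $\cdot$ false $\to$ nd $\cdot$ $F$
impi : $\Pi F{:}$o. $\Pi G{:}$o. (nd $\cdot$ $F$ $\to$ nd $\cdot$ $G$) $\to$ nd $\cdot$ (imp $\cdot$ ($F$; $G$))
impe : $\Pi F{:}$o. $\Pi G{:}$o. nd $\cdot$ (imp $\cdot$ ($F$; $G$)) $\to$ nd $\cdot$ $F$ $\to$ nd $\cdot$ $G$

The usual introduction and elimination rules of $\neg F$ can then be formulated as
derived rules of inference.

$$\frac{\begin{array}{c} \overline{\vdash F}^{\,u} \\ \vdots \\ \vdash \bot \end{array}}{\vdash \neg F}\ \neg I^u \qquad \frac{\vdash \neg F \qquad \vdash F}{\vdash \bot}\ \neg E$$

Clearly, $\neg I^u$ is a restriction of ${\supset}I^u$ and $\neg E$ is a restriction of ${\supset}E$. We represent
these rules as defined constants in LF. This is an example of a notational
definition at the level of derivations.

noti : $\Pi F{:}$o. (nd $\cdot$ $F$ $\to$ nd $\cdot$ false) $\to$ nd $\cdot$ (not $\cdot$ $F$)
     = $\lambda F{:}$o. $\lambda u{:}$(nd $\cdot$ $F$ $\to$ nd $\cdot$ false). impi $\cdot$ ($F$; false; $u$)
note : $\Pi F{:}$o. nd $\cdot$ (not $\cdot$ $F$) $\to$ nd $\cdot$ $F$ $\to$ nd $\cdot$ false
     = $\lambda F{:}$o. $\lambda u_1{:}$nd $\cdot$ (not $\cdot$ $F$). $\lambda u_2{:}$nd $\cdot$ $F$. impe $\cdot$ ($F$; false; $u_1$; $u_2$)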

4 Definitions and Algorithms for Equality 

In this paper we study only notational definitions. We do not explicitly treat 
other forms of definitions, such as recursive definitions, but our techniques are 
applicable in more general circumstances. For example, in MLF [HP98] — an 
implementation of LF extended with a module system — definitions are used to 
express logical interpretations. 

Semantically, definitions are transparent, that is, the meaning of any term 
can be determined by expanding all definitions. But from a pragmatic point of 
view expanding all definitions is unsatisfactory for several reasons. First of all, 
even if the definitions are simple, their expansion is likely to be required
frequently, in the core of an implementation. Secondly, definitions can duplicate
their arguments, leading to a potential explosion in size unless special
implementation techniques are employed. Thirdly, expanding all definitions means that
error messages and other output are often rendered illegible. 
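
The second point can be made concrete with a small sketch. It assumes a first-order term representation (pairs of a head and a tuple of arguments) and a hypothetical definition dup that uses its argument twice; neither is taken from any implementation discussed here.

```python
# Terms are (head, args) pairs. A notational definition is modelled as a
# Python function from argument terms to the body of the definition.
# "dup" is a hypothetical definition: dup = \x. pair (x; x).
DEFS = {"dup": lambda x: ("pair", (x, x))}


def expand(term, defs=DEFS):
    """Expand every defined constant in term (definitions are not recursive)."""
    head, args = term
    args = tuple(expand(a, defs) for a in args)
    if head in defs:
        return expand(defs[head](*args), defs)
    return (head, args)


def size(term):
    """Number of constant occurrences in term."""
    return 1 + sum(size(a) for a in term[1])


def nest(n):
    """Build dup (dup (... (c))) with n occurrences of dup."""
    term = ("c", ())
    for _ in range(n):
        term = ("dup", (term,))
    return term


if __name__ == "__main__":
    t = nest(10)
    print(size(t), size(expand(t)))
```

Each expansion of dup doubles the term below it, so the unexpanded term grows linearly in the nesting depth while its expansion grows exponentially.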

In this section we characterize a class of definitions whose expansion can 
frequently be avoided when comparing terms for equality. Based on these results, 
we show in the next section that the same criterion can be used to even greater 
benefit in unification. 
4.1 Injectivity

Most algorithms for equality and unification rely on decomposing a problem 

$H \cdot S_1 = H \cdot S_2$ \qquad (1)

into

$S_1 = S_2$ \qquad (2)

but if $H = d$ and $d : A = M$ is a notational definition, then (1) stands for

$M \cdot S_1 = M \cdot S_2.$ \qquad (3)

Since = is a congruence, it follows trivially that (2) always implies (3). But 
the reverse does not necessarily hold, for example, if M ignores an argument. 
We call those terms M for which (3) implies (2) injective. For definitions which 
are injective, decomposition is complete. Recall that we assume all signatures,
contexts, objects, equations, etc. to be valid.

Definition 1 (Injectivity). A definition $d : A = M$ is injective iff for all
contexts $\Delta$ and spines $S_1$ and $S_2$,

$\Delta \vdash M \cdot S_1 = M \cdot S_2$ implies $\Delta \vdash S_1 = S_2$.



4.2 Strictness 

Many algorithms for equality avoid expanding definitions in equations of the
form $d \cdot S_1 = d \cdot S_2$ until the equality of the arguments $S_1 = S_2$ fails. If that
happens, definitions are expanded, and the algorithm continues with the expanded
terms, probably redoing much previous computation. Without further improve- 
ments such an algorithm could be exponential for first-order terms and worse at 
higher types. In contrast, if we know that d is injective, the algorithm can fail 
immediately. 
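
A minimal sketch of this strategy for first-order terms is given below. The term representation, the table DEFS, the definition tr and the counter stats are illustrative assumptions; the set injective stands for the definitions already known to be injective.

```python
TRUE, FALSE = ("true", ()), ("false", ())

# Definitions as Python functions from argument terms to bodies.
DEFS = {
    "not": lambda f: ("imp", (f, FALSE)),  # not = \F. imp (F; false)
    "tr": lambda f: TRUE,                  # hypothetical: ignores its argument
}


def equal(s, t, injective, stats, defs=DEFS):
    """Decide equality modulo definitions, expanding as late as possible."""
    (h1, a1), (h2, a2) = s, t
    if h1 == h2 and len(a1) == len(a2):
        # Try decomposition first: compare the arguments pairwise.
        if all(equal(x, y, injective, stats, defs) for x, y in zip(a1, a2)):
            return True
        # For constructors and injective definitions decomposition is
        # complete, so failure of the arguments is final.
        if h1 not in defs or h1 in injective:
            return False
    # Otherwise expand a defined head and retry.
    if h1 in defs:
        stats["unfold"] += 1
        return equal(defs[h1](*a1), t, injective, stats, defs)
    if h2 in defs:
        stats["unfold"] += 1
        return equal(s, defs[h2](*a2), injective, stats, defs)
    return False


if __name__ == "__main__":
    stats = {"unfold": 0}
    print(equal(("not", (TRUE,)), ("not", (FALSE,)), {"not"}, stats), stats)
```

With not declared injective the comparison of not applied to true and not applied to false fails without any expansion; with an empty injective set the same comparison needs two expansions before it can fail.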

Since injectivity is a semantic criterion, we have developed a syntactic cri- 
terion called strictness which guarantees injectivity and which can be easily 
checked. Informally, a notational definition is said to be strict, if each argu- 
ment parameter occurs at least once in a rigid position [Hue75], applied only to 
pairwise distinct local parameters. If there are no defined constants, the rigid 
positions in a $\beta$-normal form are those resulting from erasing the spines following
argument parameters. If there are defined constants we distinguish (inductively) 
between strict and non-strict ones: the former are treated like constructors, the 
latter are expanded. We also do not consider the head of a definition to be a 
rigid position (see Example 2). Our notion of strictness is a crude approximation 
of the notion of strictness found in functional programming. 

The definition of not, for example, is strict, because $F$ appears in a rigid
position. noti is also strict, because its argument parameters $F$ and $u$ occur in
rigid positions. The same holds for note, because $F$, $u_1$, and $u_2$ occur in rigid
positions.

In the following we analyze some counterexamples to illustrate strictness and 
its relation to injectivity. 
Example 1 (Universal quantification). The logic presented in Section 3 can be
extended to first order by introducing terms $T$ and a universal quantifier

$F ::= \ldots \mid \forall x.\, F$

In LF, terms are represented by objects of a new type $i$, and the universal
quantifier by a new constructor

$\mathrm{forall} : (i \to o) \to o.$

The (true) formula $(\forall x.\, F(x)) \supset F(t)$ can be defined as

$\mathrm{allinst} = \lambda F{:}i \to o.\ \lambda T{:}i.\ \mathrm{imp} \cdot (\mathrm{forall} \cdot F;\ F \cdot T)$

allinst is not strict because $T$ does not occur in a rigid position, even though $F$
does. Indeed, if $F(x)$ does not actually depend on $x$, then $t$ is not uniquely
determined and

$\mathrm{allinst} \cdot (F; T) = \mathrm{allinst} \cdot (F; T')$

holds even if $T$ and $T'$ are different.



Example 2 (Identity). The definition of the identity at function type,
$\mathrm{id} = \lambda F{:}o \to o.\ \lambda G{:}o.\ F \cdot G$, is not strict for two reasons: the only occurrence of $F$ is at
the head of the definition, and the only occurrence of $G$ is as an argument to $F$.
It is also not injective, because

$\mathrm{id} \cdot (\lambda F{:}o.\ \mathrm{true};\ \mathrm{false}) = \mathrm{id} \cdot (\lambda F{:}o.\ \mathrm{true};\ \mathrm{true})$

can be reduced to

$\mathrm{true} = \mathrm{true}.$



Example 3 (Identity at base type). The definition $\mathrm{id}_o = \lambda F{:}o.\ F$ is not strict
since $F$ occurs at the head of the definition. However, the identity at base type
is injective. We must rule it out for different reasons (see the discussion of the
occurs-check in unification in Section 5).

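Example 4 (Application to constant). Consider $\mathrm{at} = \lambda F{:}o \to o.\ \mathrm{not} \cdot (F \cdot \mathrm{true})$.
Note that the argument to $F$ is not a local parameter but a constant. The
definition is hence not strict. The equality problem

$\mathrm{at} \cdot (\lambda F{:}o.\ F) = \mathrm{at} \cdot (\lambda F{:}o.\ \mathrm{true})$

can be expanded to

$(\lambda F{:}o \to o.\ \mathrm{not} \cdot (F \cdot \mathrm{true})) \cdot (\lambda F{:}o.\ F)$
$\quad = (\lambda F{:}o \to o.\ \mathrm{not} \cdot (F \cdot \mathrm{true})) \cdot (\lambda F{:}o.\ \mathrm{true})$

which holds because $\mathrm{not} \cdot \mathrm{true} = \mathrm{not} \cdot \mathrm{true}$. Hence, the definition is not injective.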
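
The counterexamples in Examples 1, 2 and 4 can be replayed in a small sketch that models LF functions by Python functions and constructors by tuples. This is an illustration only: Python compares functions by identity, so the sketch can confirm that two applications with different arguments expand to equal terms, but it is not a general equality test for higher-order terms. All names below are chosen for the illustration.

```python
TRUE, FALSE = ("true",), ("false",)


def imp(f, g):
    return ("imp", f, g)


def forall(f):
    return ("forall", f)


def not_(f):
    return imp(f, FALSE)           # not = \F. imp (F; false)


def allinst(F, T):                 # Example 1
    return imp(forall(F), F(T))


def id_(F, G):                     # Example 2
    return F(G)


def at(F):                         # Example 4
    return not_(F(TRUE))


def const_true(x):
    """A function that does not depend on its argument."""
    return TRUE


t1, t2 = ("t1",), ("t2",)

if __name__ == "__main__":
    print(allinst(const_true, t1) == allinst(const_true, t2))
    print(id_(const_true, FALSE) == id_(const_true, TRUE))
    print(at(lambda x: x) == at(const_true))
```

In each case the two applications differ in an argument yet expand to the same term, which is exactly the failure of injectivity described above. By contrast, not_ applied to different formulas yields different terms.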
The first part in the definition of strictness formalizes the requirement that
arguments to rigid occurrences of argument parameters must be pairwise distinct
local parameters. This is exactly the requirement imposed on higher-order
patterns [Mil91]. In the judgments below we generally use $\Gamma$ for a context consisting
of argument parameters to a definition, and $\Delta$ consisting of local parameters.

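Definition 2 (Pattern spine). Let $\Delta$ be a context, $S$ be a spine. $S$ is a pattern
spine iff $\Delta \vdash S\ \mathrm{pat}$ holds, which is defined by the following rules:

$$
\frac{}{\Delta \vdash \mathrm{nil}\ \mathrm{pat}}\ \mathrm{ps\_nil}
\qquad
\frac{\Delta_1, \Delta_2 \vdash S\ \mathrm{pat}}{\Delta_1, x{:}A, \Delta_2 \vdash x; S\ \mathrm{pat}}\ \mathrm{ps\_cons}
$$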
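
Read as a recursive procedure, the two rules amount to checking that the spine consists of pairwise distinct local parameters. A minimal sketch, assuming spines and contexts are given simply as lists of parameter names (types are ignored):

```python
def is_pattern_spine(spine, delta):
    """Check Delta |- S pat for a spine and a context given as lists of names."""
    if not spine:                       # rule ps_nil
        return True
    x, rest = spine[0], spine[1:]
    if x not in delta:                  # x must be a local parameter
        return False
    # rule ps_cons: x is removed from the context, so it cannot occur again
    return is_pattern_spine(rest, [y for y in delta if y != x])


if __name__ == "__main__":
    print(is_pattern_spine(["x", "y"], ["x", "y", "z"]))
    print(is_pattern_spine(["x", "x"], ["x", "y", "z"]))
```

Removing x from the context in the recursive call is what enforces distinctness, mirroring the premise of ps_cons.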



The formal system for strictness is defined by four mutually dependent judgments.
The central judgment of local strictness, $\Gamma; \Delta \Vdash_x M$, enforces that the
argument parameter $x$ occurs in a rigid position in $M$ where it is applied to a
pattern spine. Every argument parameter must be locally strict, which is enforced
by global strictness, $\Gamma \Vdash M$. As an auxiliary judgment we use relative
strictness, $\Gamma \Vdash_x M$, where the leading abstractions in $M$ are treated as argument
parameters. $\beta$-redexes and $\delta$-redexes involving non-strict defined constants
are reduced by $M \longrightarrow M'$.



Definition 3 (Strictness). Let $\Gamma$ be a context of argument parameters, and $\Delta$
a context of local parameters. We define

$M \longrightarrow M'$ \qquad $M$ weak head-reduces to $M'$
$\Gamma; \Delta \Vdash_x M$ \qquad $x$ is locally strict in $M$
$\Gamma \Vdash_x M$ \qquad $x$ is strict in $M$
$\Gamma \Vdash M$ \qquad $M$ is strict

by the rules in Figure 1. We say that the definition $d : A = M$ is strict if $\cdot \Vdash M$
holds.



The main technical contribution of this paper is that strict definitions are 
injective. The proof is non-trivial and requires a sequence of properties sketched 
below. 

Lemma 1 (Pattern spines). Let $S$ be a spine s.t. $\Delta \vdash S\ \mathrm{pat}$ and $M_1$ and $M_2$
be objects valid in $\Gamma$ disjoint from $\Delta$.

If $\Gamma, \Delta \vdash M_1 \cdot S = M_2 \cdot S$ then $\Gamma \vdash M_1 = M_2$.

Proof. By induction over the derivation of $\Delta \vdash S\ \mathrm{pat}$.

Using inductions over local, relative, and global strictness, we can then show
the completeness direction of our claim for strict $d : A = M$:

$\Gamma \vdash M \cdot S_1 = M \cdot S_2$ implies $\Gamma \vdash S_1 = S_2$.

We cannot prove this directly by induction, but must generalize to the following
lemma which requires substitutions $\sigma$. We use standard notation for
substitutions, which must always be the identity on local parameters (usually
declared in $\Delta$). Because of possible dependencies, a substitution which maps
variables in $\Gamma$ to objects with variables in $\Gamma'$ will map a parameter context $\Delta$ to
a context $\Delta'$ where each declaration $y : A$ in $\Delta$ is mapped to $y : A[\sigma]$. We
write $\Gamma'; \Delta' \vdash \sigma : \Gamma; \Delta$ for valid substitutions.

Fig. 1. A formal system for strictness

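Lemma 2 (Completeness). Let $\sigma_1, \sigma_2$ be substitutions which satisfy
$\Gamma'; \Delta' \vdash \sigma_1 : \Gamma; \Delta$ and $\Gamma'; \Delta' \vdash \sigma_2 : \Gamma; \Delta$, respectively.

1. If $\Gamma; \Delta \Vdash_x M$ and $\Gamma', \Delta' \vdash M[\sigma_1] = M[\sigma_2]$ then $\Gamma' \vdash \sigma_1(x) = \sigma_2(x)$.

2. If $\Gamma; \Delta \Vdash_x S$ and $\Gamma', \Delta' \vdash S[\sigma_1] = S[\sigma_2]$ then $\Gamma' \vdash \sigma_1(x) = \sigma_2(x)$.

3. If $\Gamma \Vdash_x M$ and $\Gamma' \vdash M[\sigma_1] \cdot S = M[\sigma_2] \cdot S$ then $\Gamma' \vdash \sigma_1(x) = \sigma_2(x)$.

4. If $\Gamma \Vdash M$ and $\Gamma' \vdash M[\sigma_1] \cdot S_1 = M[\sigma_2] \cdot S_2$ then $\Gamma' \vdash S_1 = S_2$.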
Proof. The four parts are proven by simultaneous induction over the given
strictness derivations, using Lemma 1 and Property 1.

As an immediate corollary, strictness is a sufficient criterion for injectivity.

Theorem 1 (Injectivity). If $d : A = M$ is strict, that is, $\cdot \Vdash M$, then
$d : A = M$ is injective.

Proof. Using Lemma 2, part 4, for $\sigma_1 = \sigma_2 = \mathrm{id}$.

The rules of strictness implicitly define an algorithm to decide if a defini- 
tion is strict or not. The algorithm traverses the structure of a term visiting all 
rigid positions. If it finds at least one occurrence of every argument parameter 
of the definition applied to a pattern spine (ls_pat), it stops and signals success. 
If the algorithm comes to a defined and strict constant, it applies ls_d or rs_d, 
otherwise it expands the definition using ls_red or rs_red, respectively. The algo- 
rithm terminates for ls_red and rs_red, because definitions cannot be recursive. 
In an implementation of this algorithm, one would annotate each definition with 
strictness information, and hence no redundant computation is necessary for ls_d, 
rs_d, and nr_delta. A minor variant of this algorithm has been implemented in 
the Twelf system [PS98]. 

It is easy to verify that all definitions from Section 3 satisfy the strictness 
condition. Definitions at base type are always strict. Definitions in normal form 
whose argument parameters are of base type are strict if each argument parameter
occurs and the definition is not the identity. Most notational definitions of these two
forms are thus accepted by our criterion. 
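
For this first-order special case the check is short enough to sketch directly. The sketch assumes that the body is in normal form and contains only constructors, that terms are (head, args) pairs, and that parameter names do not clash with constant names; it does not implement the general rules of Figure 1.

```python
def occurs(x, term):
    """Does the parameter x occur in the first-order term?"""
    head, args = term
    return (head == x and not args) or any(occurs(x, a) for a in args)


def is_strict_first_order(params, body):
    """Strictness for normal-form bodies whose parameters are of base type."""
    head, _ = body
    if head in params:
        # The body is a bare parameter (the identity); the head of a
        # definition is not considered a rigid position.
        return False
    return all(occurs(x, body) for x in params)


if __name__ == "__main__":
    not_body = ("imp", (("F", ()), ("false", ())))  # not = \F. imp (F; false)
    tr_body = ("true", ())                          # hypothetical: \F. true
    id_body = ("F", ())                             # \F. F
    for body in (not_body, tr_body, id_body):
        print(is_strict_first_order(["F"], body))
```

Under these assumptions every occurrence of a parameter below a constructor is rigid, so occurrence alone suffices once the identity has been excluded.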

At higher types, one more frequently encounters definitions which are not 
injective. Consequently, they cannot be strict according to our definition. A more 
accurate extension would have to analyze the structure of functional arguments 
to higher-order definitions, as in the case of strictness analysis for functional 
programming languages (see, for example, [HM94]). However, we suspect one 
quickly reaches the point of diminishing returns for this kind of complex analysis. 

5 Results for Unification 

So far we have shown how algorithms for testing equality (that is,
$\beta\delta$-convertibility) can be improved by using strictness. In the presence of meta-variables
these observations can be generalized to unification. We write $\Psi; \Delta \vdash M_1 \approx M_2$
for a unification problem, where $M_1$, $M_2$ are well-typed objects of the same type
which can contain meta-variables declared in $\Psi$. All other parameters which are
not subject to instantiation are declared in $\Delta$. So this corresponds to a $\exists\forall$ prefix
of a unification problem.

Deciding when to expand definitions is in this setting more subtle than for 
plain equality algorithms. Expanding them only in the case of failure may return 
a unifier which is not most general and hence renders the algorithm incomplete. 
Not expanding them may cause an unnecessary occurs-check failure, yet another 
source of incompleteness. The following two examples show these situations. 
Example 5 (Most-general unifier). Let $\mathrm{tr} : o \to o = \lambda F{:}o.\ \mathrm{true}$ be a definition,
and $X$ a meta-variable. The unification problem $X{:}o; \cdot \vdash \mathrm{tr} \cdot \mathrm{false} \approx \mathrm{tr} \cdot X$ has
as solution $\theta = \mathrm{false}/X$ if tr is not expanded. Obviously, this solution is not
most general, since the most general solution leaves $X$ uninstantiated.

Example 6 (Occurs-check). Let tr be the same definition as above, and $X$ a
meta-variable. The unification problem $X{:}o; \cdot \vdash X \approx \mathrm{tr} \cdot X$ has no solution if tr is
not expanded, because $X$ occurs on its left-hand side and as an argument to tr.
But obviously the problem has a solution, $\theta = \mathrm{true}/X$.

Most unification algorithms decompose a unification problem of the form

$\Psi; \Delta \vdash H \cdot S_1 \approx H \cdot S_2$ \qquad (4)

into

$\Psi; \Delta \vdash S_1 \approx S_2$ \qquad (5)

where $H$ is not a defined constant, otherwise they expand the definition. The
unification algorithm for the higher-order pattern fragment [DHKP96] which is
employed in Twelf follows the same technique. But strict definitions do not need
to be expanded since, because of injectivity, every unifier $\theta$ of (4) is also a
unifier of (5) and vice versa. This is expressed in the following theorem.

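Theorem 2 (Most general unifiers). Let $d : A = M$ be a strict definition.
Then the unification problems

$\Psi; \Delta \vdash d \cdot S_1 \approx d \cdot S_2$

and

$\Psi; \Delta \vdash S_1 \approx S_2$

have the same set of solutions.

Proof. Let $\theta$ be a unifier, satisfying $\Psi'; \Delta' \vdash \theta : \Psi; \Delta$.

$\Psi'; \Delta' \vdash (d \cdot S_1)[\theta] = (d \cdot S_2)[\theta]$
iff $\Psi'; \Delta' \vdash d \cdot (S_1[\theta]) = d \cdot (S_2[\theta])$
iff $\Psi'; \Delta' \vdash S_1[\theta] = S_2[\theta]$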

This guarantees that the unifier determined by the unification algorithm 
which does not expand strict definitions unless the two heads differ, is most 
general. 
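
The following first-order sketch illustrates the resulting strategy, including the treatment of Examples 5 and 6. It is not the Twelf algorithm: meta-variables are modelled as strings, terms as (head, args) pairs, definitions as Python functions, and the set strict lists the definitions assumed to have passed a strictness check. All of these representation choices are assumptions made for the illustration.

```python
TRUE, FALSE = ("true", ()), ("false", ())
DEFS = {
    "not": lambda f: ("imp", (f, FALSE)),  # strict
    "tr": lambda f: TRUE,                  # not strict: ignores its argument
}


def walk(t, s):
    """Follow bindings of meta-variables (strings) in the substitution s."""
    while isinstance(t, str) and t in s:
        t = s[t]
    return t


def expand_nonstrict(t, strict, s, defs=DEFS):
    """Expand only the non-strict definitions occurring in t."""
    t = walk(t, s)
    if isinstance(t, str):
        return t
    head, args = t
    if head in defs and head not in strict:
        return expand_nonstrict(defs[head](*args), strict, s, defs)
    return (head, tuple(expand_nonstrict(a, strict, s, defs) for a in args))


def occurs(x, t, s):
    t = walk(t, s)
    if isinstance(t, str):
        return t == x
    return any(occurs(x, a, s) for a in t[1])


def unify(a, b, strict, s, defs=DEFS):
    """Return a substitution extending s, or None if there is no unifier."""
    a, b = walk(a, s), walk(b, s)
    if isinstance(b, str) and not isinstance(a, str):
        a, b = b, a
    if isinstance(a, str):                 # a is a meta-variable
        b = expand_nonstrict(b, strict, s, defs)
        if a == b:
            return s
        if occurs(a, b, s):                # remaining occurrences are strict
            return None
        return {**s, a: b}
    (h1, args1), (h2, args2) = a, b
    if h1 == h2 and (h1 not in defs or h1 in strict):
        for x, y in zip(args1, args2):     # decompose without expanding
            s = unify(x, y, strict, s, defs)
            if s is None:
                return None
        return s
    if h1 in defs:                         # clash or non-strict head: expand
        return unify(defs[h1](*args1), b, strict, s, defs)
    if h2 in defs:
        return unify(a, defs[h2](*args2), strict, s, defs)
    return None


if __name__ == "__main__":
    strict = {"not"}
    print(unify(("tr", (FALSE,)), ("tr", ("X",)), strict, {}))
    print(unify("X", ("tr", ("X",)), strict, {}))
    print(unify("X", ("not", ("X",)), strict, {}))
```

In this sketch the problem of Example 5 yields the empty substitution, the problem of Example 6 is solved by binding X to true, and an equation between X and not applied to X is rejected without expanding not.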

In addition, we can extend this algorithm to also treat the occurs-check
problem correctly: We say that $\Psi; \Delta \vdash X\, y_1 \ldots y_k \approx M$, where $X$ is defined
in $\Psi$ and $y_1, \ldots, y_k$ are parameters in $\Delta$, fails the occurs-check if $X$ has a strict
occurrence in $M$ (not to be confused with a locally strict one). This is a
generalization of Huet’s original rigid path criterion for non-unifiability by allowing
some arguments to $X$. Note also that this definition of occurs-check does not
need to expand strict definitions. We show that unification problems which fail
the occurs-check do not have a unifier.

Informally, one assumes a solution $\theta$ for $X$ and then counts the number of
constructor and parameter occurrences in the normal form of $(X\, y_1 \ldots y_k)[\theta]$ and
$M[\theta]$ to arrive at a contradiction, a similar argument as in [Pfe91]. In addition,
we make use of two further properties. First, rigid positions in the arguments 
are preserved under normalization, and second, meta-variables can never occur 
in the head position of these normal forms. 

The proof of the first property is rather difficult because definitions can be 
nested. In our proof we resolve this problem by first showing the admissibility 
of eliminating definitions and then inductively normalize each defined constant 
starting from the inside out. We write $\mathrm{nf}(M)$ for the normal form of an
object $M$, based on $\beta\delta$-conversion.

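Lemma 3 (Admissibility of eliminating definitions). Let $\sigma$ be a substitution
satisfying $\Gamma'; \Delta' \vdash \sigma : \Gamma; \Delta$. Furthermore, let $x$ be in $\Gamma$, $y$ in $\Gamma'$, and $M$, $S$, $\sigma$ in
normal form.

1. If $\Gamma; \Delta \Vdash_x M$ and $\Gamma'; \Delta' \Vdash_y \sigma(x)$ then $\Gamma'; \Delta' \Vdash_y \mathrm{nf}(M[\sigma])$.

2. If $\Gamma; \Delta \Vdash_x S$ and $\Gamma'; \Delta' \Vdash_y \sigma(x)$ then $\Gamma'; \Delta' \Vdash_y \mathrm{nf}(S[\sigma])$.

3. If $\Gamma \Vdash_x M$ and $\Gamma'; \Delta' \Vdash_y \sigma(x)$ then $\Gamma'; \Delta' \Vdash_y \mathrm{nf}(M[\sigma] \cdot S)$.

4. If $\Gamma \Vdash M$ and $\Gamma'; \Delta' \Vdash_y S$ then $\Gamma'; \Delta' \Vdash_y \mathrm{nf}(M[\sigma] \cdot S)$.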

Proof. The four parts are proven by simultaneous induction over the given strict- 
ness derivations. 

A direct consequence of the admissibility of eliminating definitions is that the 
property of being strict is preserved under normalization. 

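Lemma 4 (Eliminating definitions).

1. If $\Gamma; \Delta \Vdash_x M$ then $\Gamma; \Delta \Vdash_x \mathrm{nf}(M)$.

2. If $\Gamma; \Delta \Vdash_x S$ then $\Gamma; \Delta \Vdash_x \mathrm{nf}(S)$.

3. If $\Gamma \Vdash_x M$ then $\Gamma \Vdash_x \mathrm{nf}(M)$.

4. If $\Gamma \Vdash M$ then $\Gamma \Vdash \mathrm{nf}(M)$.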

Proof. The proof proceeds by simultaneous induction over the given strictness 
derivations, using Lemma 3. 

To arrive at the contradiction described above, we must ensure that the head
of a definition is never a meta-variable (the head of a $\lambda$-term is defined as the
head of its body). We call such objects rigid.

Definition 4 (Rigid objects). An object $M$ defined in $\Psi, \Delta$, where parameters
in $\Delta$ are not subject to instantiation, is called a rigid object iff $\mathrm{head}(\mathrm{nf}(M))$ is
either a constant or a parameter defined in $\Delta$.

The head of a definition, no matter to which arguments it is applied, cannot 
be a meta-variable. 
Lemma 5 (Head). If $d : A = M$ is a strict definition ($\Gamma \Vdash M$), and $\sigma$ a
substitution with domain $\Gamma$, then $M[\sigma] \cdot S$ is a rigid object.

Proof. By induction over the strictness derivation of $\Gamma \Vdash M$.

The other part of the argument involves counting the number of parameter
and constructor occurrences in a term $M$ which we write as $|M|$. It can be
easily shown that this measure satisfies the following property on the unification
problem in question.

Lemma 6 (Size). Let $\Gamma; \Delta \vdash X\, y_1 \ldots y_k \approx M$ be a unification problem,
where $M$ is strict in $X$ ($\Gamma; \Delta \Vdash_X M$), and $\theta$ be a unifier. Then

$|\mathrm{nf}((X\, y_1 \ldots y_k)[\theta])| \le |\mathrm{nf}(M[\theta])|$

Proof. By induction over the strictness derivation $\Gamma; \Delta \Vdash_X M$, using Lemma 4.

The third technical result of our paper can now be stated and proven: If a 
unification problem fails the occurs-check, it cannot have any unifiers. 

Theorem 3 (Occurs-check). Let $M$ be a rigid object, and $\Gamma$ a context of
free variables. Furthermore, let $X$ occur strictly in $M$ ($\Gamma; \Delta \Vdash_X M$). Then the
unification problem

$\Gamma; \Delta \vdash X\, y_1 \ldots y_k \approx M$

has no unifiers.

Proof. Assume the unification problem fails the occurs-check and has the
unifier $\theta$. By Lemma 6, it follows that

$|\mathrm{nf}((X\, y_1 \ldots y_k)[\theta])| \le |\mathrm{nf}(M[\theta])|$

but because of Lemma 5 we can show that

$|\mathrm{nf}((X\, y_1 \ldots y_k)[\theta])| < |\mathrm{nf}(M[\theta])|$

contradicting the assumption that $\theta$ is a unifier.

Hence, a unification problem which fails the occurs-check does not have any
unifiers. The occurs-check is also the reason why identity functions are not
considered strict. An equation $X \approx \mathrm{id}_o \cdot X$ would fail the occurs-check but have a
solution (where $X$ is uninstantiated).

Therefore, strict definitions can be treated mostly as constructors in a unifi- 
cation algorithm. They must be expanded only in the case of a constant clash at 
the head during decomposition of so-called rigid-rigid equations. The unification 
algorithm remains sound and complete. Note that this observation is indepen- 
dent of whether one uses an algorithm based on Miller’s higher-order patterns 
or Huet’s original algorithm for higher-order unification. 
6 Conclusion

We have identified a class of strict notational definitions and analyzed the way 
they interact with algorithms for equality and unification. Notational definitions 
must be expanded only in the case of constant clash. This property can be 
exploited to make many implementations of these algorithms more efficient, while 
preserving completeness and soundness with respect to $\beta\delta$-conversion. We also
presented an algorithm to efficiently check definitions for strictness. 

Many theorem provers rely on an ad hoc treatment of definitions. We believe 
that these systems can benefit from the results in this paper in terms of efficiency 
and robustness. 

In future work we plan to evaluate the concept of strictness empirically in 
our implementation. If warranted by the results, we may investigate partially 
strict definitions, that is, definitions where some of the argument parameters
are locally strict and others are not. In such a situation definitions may only
need to be “partially expanded”, comparing the strict and reducing the
non-strict argument positions.
Abstract. If the classical definition of topological space is analysed in
the light of an intuitionistic and predicative foundation such as Martin-Löf’s
type theory, one is led to the notion of basic pair: a pair of sets,
concrete points and observables (or formal neighbourhoods), linked by a
binary relation called forcing. The new discovery is that this is enough 
to introduce the topological notions of open and closed subsets, both in 
the concrete (pointwise) and in the formal (pointfree) sense. Actually, a 
new rich structure arises, consisting of a symmetry between concrete and 
formal and of a logical duality between open and closed. Closed subsets 
are defined primitively, as universal-existential images of subsets along 
the forcing relation, while open subsets are existential-universal images. 
So, in the same way as logic gives a theory of subsets as the extension 
of unary propositional functions over a given set, now logic is seen to 
produce topology if we pass to two sets linked by a relation, that is a 
propositional function with two arguments. 

Usual topological spaces are obtained by adding the condition that the 
extensions of observables form a base for a topology, which is seen to 
be equivalent to distributivity. Formal topologies are then obtained by 
axiomatizing the structure induced on observables, with some improve- 
ments on previous definitions. A morphism between basic pairs is essen- 
tially a pair of relations producing a commutative square: this is thus the 
essence of continuity. Usual continuous functions become a special case. 
This new perspective, which is here called basic picture, starts a new 
phase in constructive topology, where logic and topology are deeply con- 
nected and where the pointwise and the pointfree approach to topology 
can live together. It also leads to the development of topology in a more
general, nondistributive sense.



1 Introduction 

If the aim is to develop mathematics within a constructive set theory, topology 
seems to be a good test since it is a field in which foundational problems are 
particularly evident. This is a fortiori true if constructivity is meant in a stricter 
sense to include predicativity, as in Martin-Löf’s constructive type theory [6].
In fact, the usual definition of topological space involves a kind of quantification 
over subsets, which has to be justified predicatively. Moreover, in many well
known topological spaces the definition of points requires an infinite amount of
information (one example is given by real numbers) and thus it is not a priori
granted that the collection of points forms a set.

Such problems are solved in formal topology (see [9] and [11]), which is
strictly constructive since it is developed fully within Martin-Löf’s type theory
(henceforth simply type theory). To support intuition, type theory is equipped
with a notation for subsets (as introduced and justified in [13]); in particular,
for any set $S$, $U \subseteq S$ means that $U$ is a propositional function over $S$ and $a \,\epsilon\, U$
means that $a \in S$ and $U(a)$ is true.

For our present purposes, the definition of formal topology can be motivated
as follows. Assume that a topology $\Omega X$ on a set of points $X$ is given by means
of a base. This is expressed in type theory as a family of subsets of $X$ indexed
on a set $S$, that is a function $\mathrm{ext} : S \to \mathcal{P}X$; and this is the same (cf. [13]) as
a binary relation $x \Vdash a$ which, for $x \in X$ and $a \in S$, says that $a$ is a formal
neighbourhood of $x$. The main idea is then to transfer the structure of $\Omega X$ onto
the set $S$, and to this aim $S$ is equipped with some new primitives. A natural
choice is to add a binary operation $\cdot$ satisfying $x \Vdash a \cdot b$ iff $x \Vdash a$ and $x \Vdash b$
and thus called formal intersection, a distinguished element $1 \in S$ satisfying
$x \Vdash 1$ for any $x \in X$, and an infinitary relation $a \lhd U$ for $a \in S$ and $U \subseteq S$,
satisfying $a \lhd U$ iff $(\forall x \in X)(x \Vdash a \to (\exists b \,\epsilon\, U)(x \Vdash b))$ and thus called formal
cover. A unary predicate $\mathrm{Pos}(a)$ prop $(a \in S)$ is also added, satisfying $\mathrm{Pos}(a)$ iff
$(\exists x \in X)(x \Vdash a)$ and called the positivity predicate.
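
On a finite example the cover and the positivity predicate induced by a forcing relation can be computed directly. The sketch below evaluates the defining conditions classically over finite sets, so it illustrates the definitions only and says nothing about their constructive content; the sets X and S and the relation FORCES are made up for the illustration.

```python
X = {0, 1, 2}                              # concrete points
S = {"a", "b", "c"}                        # formal neighbourhoods
FORCES = {(0, "a"), (1, "a"), (1, "b")}    # pairs (x, a) with x forcing a


def ext(a, X=X, forces=FORCES):
    """Extension of a: the points forcing a."""
    return {x for x in X if (x, a) in forces}


def covers(a, U, X=X, forces=FORCES):
    """a is covered by U iff every point forcing a forces some b in U."""
    return all(any((x, b) in forces for b in U) for x in ext(a, X, forces))


def pos(a, X=X, forces=FORCES):
    """Pos(a) iff some point forces a."""
    return len(ext(a, X, forces)) > 0


if __name__ == "__main__":
    print(covers("b", {"a"}), covers("a", {"b"}))
    print(pos("a"), pos("c"), covers("c", set()))
```

Here b is covered by {a} because the only point forcing b also forces a, while c has empty extension and is therefore covered even by the empty subset and is not positive.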

The definition of formal topology is then obtained by expressing the above
situation in pointfree terms, that is by requiring the structure $\mathcal{A} = (S, \cdot, 1, \lhd, \mathrm{Pos})$
to satisfy all the properties of the new primitives $\cdot$, $1$, $\lhd$, Pos which can be
formulated without any mention of points of $X$. This leads to (cf. [9] and [10]):

$\mathcal{A} = (S, \cdot, 1, \lhd, \mathrm{Pos})$ is a formal topology if:

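$(S, \cdot, 1)$ is a commutative monoid;

$\lhd$ satisfies

reflexivity: $\dfrac{a \,\epsilon\, U}{a \lhd U}$

transitivity: $\dfrac{a \lhd U \qquad (\forall b \,\epsilon\, U)(b \lhd V)}{a \lhd V}$

$\cdot$-left: $\dfrac{a \lhd U}{a \cdot b \lhd U}$

$\cdot$-right: $\dfrac{a \lhd U \qquad a \lhd V}{a \lhd U \cdot V}$

Pos satisfies

monotonicity: $\dfrac{\mathrm{Pos}(a) \qquad a \lhd U}{(\exists b \,\epsilon\, U)\,\mathrm{Pos}(b)}$

positivity: $\dfrac{\mathrm{Pos}(a) \to a \lhd U}{a \lhd U}$

(for an analytic explanation of such conditions see [11]).

Any infinitary relation $\lhd$ is equivalently presented as an operator on subsets
$\mathcal{A}U = \{a \in S : a \lhd U\}$; then it can be shown that $\lhd$ is a cover if and only
if $\mathcal{A}$ is a closure operator which moreover satisfies distributivity in the form
$\mathcal{A}(U \cdot V) = \mathcal{A}(U) \cap \mathcal{A}(V)$ (where $U \cdot V = \{b \cdot c : b \,\epsilon\, U, c \,\epsilon\, V\}$). Then a formal
open can be defined as a subset $U$ of $S$ such that $U = \mathcal{A}U$.

Giovanni Sambin and Silvia Gebellato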

The presence of the positivity predicate Pos (which does not appear explicitly 
in the usual theory of locales, or pointless topology, see e.g. [4]) has sometimes 
been felt as a redundancy; from the above considerations, we see that Pos(a) is 
the only primitive corresponding to an existential quantification over points, and 
it thus becomes a positive pointfree way to express that $\mathrm{ext}(a)$ is inhabited. Its
presence was due (apart from the convenience in the definition of formal points 
and in the treatment of Scott domains [14]) to the expectation of obtaining 
a good definition of formal closed subsets. As we will see here, to obtain this 
not only Pos must be kept, but the way it expresses existential quantification 
over points must be strengthened, thus reaching a binary predicate which is as 
relevant as the formal cover and dual to it, in a sense to be specified below. 

What is the point of the move to pointfree terms? An ideological rejection 
of points altogether is not a far reaching motivation in our opinion; on the 
contrary, we believe that when points form a set, this information should by 
no means be thrown away (two examples: rational numbers and all finite sets). 
The trouble is that in the most interesting examples there is no simple way 
to generate inductively all the points one would like to have. In the case of 
real numbers, this problem was overcome by Dedekind with the introduction of 
Dedekind cuts and by Brouwer with choice sequences. Formal topology allows one to
solve the same problem in more general terms by introducing the abstract notion
of formal point. Just as formal topologies are defined axiomatically by requiring all
the properties which can be expressed in the pointfree language with $\cdot$, $1$, $\lhd$ and
Pos, now formal points of a formal topology are defined to be those subsets
of $S$ which cannot be distinguished, in such language, from subsets of the form
$\Diamond x = \{a \in S : x \Vdash a\}$. Note that this idea is exactly the same which led
Dedekind from rational numbers to cuts, and to the definition of real numbers
as cuts (cf. [2]). Using the notation adopted above, in a topological space a
point $x$ satisfies the conditions:

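$x \Vdash 1$ \qquad $x \Vdash a \cdot b$ iff $x \Vdash a$ and $x \Vdash b$

$\dfrac{x \Vdash a \qquad a \lhd U}{(\exists b \,\epsilon\, U)(x \Vdash b)}$ \qquad $\dfrac{x \Vdash a}{\mathrm{Pos}(a)}$

So, a subset $\alpha$ of $S$ is said to be a formal point if, after writing $\alpha \Vdash a$ in place
of $\alpha \ni a$ (that is $a \,\epsilon\, \alpha$), all the above conditions are satisfied with $\alpha$ replacing $x$;
we reach in this way the same definition as that given in [9].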

The collection of formal points over a formal topology $\mathcal{A}$ is denoted by $Pt(\mathcal{A})$.
The structure $(Pt(\mathcal{A}), \ni, \mathcal{A})$ is called a formal space. It is type theory which gives
a precise foundational meaning to the distinction between topological spaces and
formal spaces, since it refrains from considering $Pt(\mathcal{A})$ a set like any other, and
in this sense it has favoured the emergence of formal topology itself.

In a similar way, we now see how a new quite rich structure emerges after 
rejection of the identification of closed subsets with complements of open subsets 
(which, as we shall see, is an example of the identification of $\exists\neg$ with $\neg\forall$). To this
aim, we have to go back to the simple structure $(X, \Vdash, S)$ we met above and take
it as our main object of study. It will be rewarding, mainly from a conceptual 
point of view: it leads to a new perspective on formal topology, which we have 
called basic picture and of which this paper gives a preview. 

The discovery (by Sambin) of binary Pos and hence much of the basic picture
in December ’95 was indirectly stimulated by discussions with Per Martin-Löf
on the notion of formal closed subset; morphisms and a correct appreciation
of symmetry came later (and are due to both authors). The basic picture
has been presented on several occasions, mainly at Types ’96, Aussois, December
1996, at the First Workshop on Formal Topology, Padova, October 1997 and
at WoLLIC ’98, São Paulo, July 1998.

It is a pleasure for us to thank Per Martin-Löf, both for his interest in formal
closed subsets, which is almost as old as formal topology itself, and for more 
recent discussions, in particular on some of the topics in the last paragraph 
here. We also thank Bernhard Reus for his questions during his visit to Padova 
in Autumn ’98, which helped us to improve exposition. 



2 Basic Pairs 

A structure $\mathcal{X} = (X, \Vdash, S)$, where $X$ and $S$ are arbitrary sets and $\Vdash$ is an
arbitrary binary relation between them, is here called a basic pair. To help the
intuition, we may (as in the introduction above) think of $X$ as a set of concrete
points and $S$ as a set of basic formal opens (or observables); $x \Vdash a$ can be
read as “$a$ is a formal neighbourhood of $x$” or more neutrally as “$x$ forces $a$”
and then the relation $\Vdash$ itself is called forcing. This way of reading introduces
a distinction between the left side, which is called concrete side, and the right
one, which is called formal side. The relation $\Vdash$ is the way to pass from the
concrete to the formal side, and conversely. For any $a \in S$, the extension of $a$ is
the subset of $X$ of all concrete points forcing $a$, that is $\mathrm{ext}\, a = \{x \in X : x \Vdash a\}$.
In topological terms, the family of subsets $(\mathrm{ext}\, a)_{a \in S}$ is of course a sub-base
for a topology on $X$, like any family of subsets of $X$. In general it is not a
base for a topology, since we do not require that it covers all the set $X$, that
is $(\forall x)(\exists a)(x \Vdash a)$, and that the intersection of its members is open, that is
$(\forall x)(\forall a, b)(x \Vdash a \;\&\; x \Vdash b \to (\exists c)(x \Vdash c \;\&\; \mathrm{ext}\, c \subseteq \mathrm{ext}\, a \cap \mathrm{ext}\, b))$ (cf. [3], p. 26).

In the other direction, any element $x \in X$ on the concrete side determines
a subset $\Diamond x = \{a \in S : x \Vdash a\}$ on the formal side, which is called the system
of neighbourhoods (or approximations) of $x$. The picture we have in mind is
something like:

[Figure; surviving label: approximations]

The definition of $\Diamond$ is immediately extended to any subset $A \subseteq X$ by defining
as usual $\Diamond A = \bigcup_{x \,\epsilon\, A} \Diamond x$. Spelling out the definition of union of subsets
(see [13]), we see that $\Diamond A$ is just the image of $A$ along $\Vdash$ through an existential
quantification:

$\Diamond A = \{a \in S : (\exists x \in X)(x \Vdash a \;\&\; x \,\epsilon\, A)\}.$



Because of the option for intuitionism, the image of $A$ obtained through a
universal quantification is not definable in terms of $\Diamond$ and we are thus led to put

$\Box A = \{a \in S : (\forall x \in X)(x \Vdash a \to x \,\epsilon\, A)\}.$

So both $\Diamond$ and $\Box$ are operators on subsets, that is functions from $\mathcal{P}(X)$ to $\mathcal{P}(S)$.
The fact that they are given by an existential and universal quantification
respectively is immediately visible by adopting a notation for quantification relativized
to subsets (as justified in [13]):

$\Diamond A = \{a \in S : (\exists x \,\epsilon\, \mathrm{ext}\, a)(x \,\epsilon\, A)\}$

$\Box A = \{a \in S : (\forall x \,\epsilon\, \mathrm{ext}\, a)(x \,\epsilon\, A)\}$

Also the intuition of $\Diamond A$ and $\Box A$ is now clear: in fact $a \in \Diamond A$ means that $\mathrm{ext}\, a$ meets the subset $A$, while $a \in \Box A$ means that $\mathrm{ext}\, a \subseteq A$. In the other direction, also ext is extended to any subset $U \subseteq S$ by putting $\mathrm{ext}\, U = \bigcup_{a \in U} \mathrm{ext}\, a$; $\mathrm{ext}\, U$ is the existential image of $U$ along the inverse of $\Vdash$, and is called the extension of $U$. As above, also the universal image $\mathrm{rest}\, U$ has to be considered, and is called the restriction of $U$. Using quantifiers relativized to $\Diamond x$, the formal definitions are:

$$\mathrm{ext}\, U = \{x \in X : (\exists a \in \Diamond x)(a \in U)\}$$

$$\mathrm{rest}\, U = \{x \in X : (\forall a \in \Diamond x)(a \in U)\}$$

A glance at the definitions shows that the definition of the operators ext and rest could be obtained from that of $\Diamond$ and $\Box$, respectively, just by switching the roles of the sets $X$ and $S$. In fact, writing as usual $\Vdash^{-}$ for the inverse of the relation $\Vdash$, we see that $\mathcal{X}^{-} = (S, \Vdash^{-}, X)$ is still a basic pair, just as good as $\mathcal{X} = (X, \Vdash, S)$; we call $\mathcal{X}^{-}$ the symmetric of $\mathcal{X}$. So the operators ext and rest in $\mathcal{X}$ are just the same thing as $\Diamond$ and $\Box$, respectively, but in its symmetric $\mathcal{X}^{-}$.
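
The four operators can be tried out on a small finite example. The following Python sketch is an illustration added for the reader, not part of the original development: it assumes a hypothetical three-point basic pair (the names `X`, `S`, `forces`, `weak_image` and `strong_image` are ours) and, since everything is finite, quantifiers are simply decided by enumeration. The diamond and box operators are computed as weak and strong images along the forcing relation, ext and rest as the same images along its inverse, so the symmetry between a basic pair and its symmetric is visible directly in the code.

```python
# Hypothetical finite basic pair (X, forces, S); all names are illustrative.
X = frozenset({0, 1, 2})
S = frozenset({"a", "b"})
forces = frozenset({(0, "a"), (1, "a"), (1, "b"), (2, "b")})  # (x, a): x forces a


def inverse(rel):
    """Inverse relation: swap the two components of every pair."""
    return frozenset((b, a) for (a, b) in rel)


def weak_image(rel, subset):
    """Existential image: elements related to at least one member of subset."""
    return frozenset(b for (a, b) in rel if a in subset)


def strong_image(rel, subset, codomain):
    """Universal image: elements of codomain whose rel-predecessors all lie in subset."""
    return frozenset(
        b for b in codomain if all(a in subset for (a, c) in rel if c == b)
    )


def diamond(A):  # subsets of X -> subsets of S
    return weak_image(forces, A)


def box(A):  # subsets of X -> subsets of S
    return strong_image(forces, A, S)


def ext(U):  # subsets of S -> subsets of X
    return weak_image(inverse(forces), U)


def rest(U):  # subsets of S -> subsets of X
    return strong_image(inverse(forces), U, X)


print(sorted(diamond({0})))  # neighbourhoods of the point 0
print(sorted(box({0, 1})))   # basic opens whose extension lies inside {0, 1}
print(sorted(ext({"a"})))    # points forcing a
print(sorted(rest({"a"})))   # points all of whose neighbourhoods are in {a}
```

Defining ext and rest through `inverse(forces)` rather than by separate formulas is a design choice made here only to mirror the passage to the symmetric basic pair.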

In purely mathematical terms, $\Diamond A$ and $\Box A$ give what is sometimes called the weak and strong image, respectively, of the subset $A$ along a relation, which in this case is $\Vdash$. For a relation denoted by $r$, the notation $rA$ and $r^{-*}A$, respectively, is sometimes used. Symmetrically, $\mathrm{ext}\, U$ and $\mathrm{rest}\, U$ are just the weak and strong anti-image, respectively, of $U$ along $\Vdash$. They are denoted by $r^{-}U$ and $r^{*}U$, respectively, if the relation is $r$. Notice that $r^{-}U$ and $r^{*}U$ are the same thing as the weak and strong image along the relation $r^{-}$. Even if the mathematical content is exactly the same, to help the intuition we have here preferred to adopt a specific terminology and notation, namely $\Diamond$, $\Box$, ext, rest, for weak and strong (anti-)images along the forcing relation $\Vdash$, which according to a uniform notation should have been called $\Vdash$, $\Vdash^{-*}$, $\Vdash^{-}$, $\Vdash^{*}$, respectively.

Beside the geometrical symmetry between the left side $X$ and the right side $S$, there is also a logical duality clearly present: the definitions of $\Diamond$ and $\Box$ are obtained one from the other by interchanging the roles of $\forall$ with $\exists$ and of $\to$ with $\&$. The same of course holds for ext and rest. So a picture could be:



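[Figure: the four operators ext, rest, $\Diamond$ and $\Box$ between the concrete side $X$ and the formal side $S$; ext is symmetric to $\Diamond$ and rest is symmetric to $\Box$.]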



What is the use of all this structure? We begin by seeing that the topological notions of interior and closure are immediately obtained by combinations of the four operators $\Diamond$, $\Box$, ext, rest. The symmetry of the picture will then produce also their pointfree, or formal, versions.

3 Interior and Closure 

The interior of a subset $A$ of $X$ is usually defined as the set of points of $X$ with a neighborhood all contained in $A$ (see for instance [5], pp. 42, 44). In our notation, this definition becomes

$$\mathrm{int}\, A = \{x \in X : (\exists a \in \Diamond x)(\forall y \in X)(y \Vdash a \to y \in A)\},$$

and then it is clear that such a combination of quantifiers is just the composition of ext after $\Box$, that is $\mathrm{int}\, A = \mathrm{ext}\, \Box A$. To our knowledge, this simple and basic fact had not been noticed before.
As usual we say that $A$ is open if $A = \mathrm{int}\, A$; but, of course, we cannot expect int so defined to be a topological interior operator, since nothing has been assumed that guarantees that the intersection of two open subsets is open. However, it can be proved that int is an interior operator, that is

i. $\mathrm{int}\, A \subseteq A$,

ii. if $A \subseteq B$ then $\mathrm{int}\, A \subseteq \mathrm{int}\, B$,

iii. $\mathrm{int}\, A \subseteq \mathrm{int}\, \mathrm{int}\, A$,

for any $A, B \subseteq X$. Condition i. follows immediately from the adjunction

$$\mathrm{ext}\, U \subseteq A \ \text{ iff } \ U \subseteq \Box A, \quad \text{for any } U \subseteq S \text{ and } A \subseteq X \qquad (1)$$

by taking $U$ to be $\Box A$; condition ii. follows from the fact that the operators ext and $\Box$ preserve inclusion of subsets; and iii. is a consequence of

$$\Box\, \mathrm{ext}\, \Box A = \Box A, \quad \text{for any } A \subseteq X,$$

which follows easily from (1) above.
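
Conditions i. to iii. and the adjunction (1) can be checked exhaustively on a finite example. The sketch below is an added illustration under stated assumptions: the three-point basic pair and all function names are hypothetical, and a finite check is of course no substitute for the general proof. It also exhibits two open subsets whose intersection is not open, which is why int is not a topological interior operator in general.

```python
from itertools import combinations

# Hypothetical finite basic pair; names are illustrative only.
X = frozenset({0, 1, 2})
S = frozenset({"a", "b"})
forces = frozenset({(0, "a"), (1, "a"), (1, "b"), (2, "b")})


def subsets(base):
    """All subsets of a finite set, as frozensets."""
    items = sorted(base)
    return [frozenset(c) for n in range(len(items) + 1) for c in combinations(items, n)]


def ext(U):
    return frozenset(x for x in X if any((x, a) in forces for a in U))


def box(A):
    return frozenset(a for a in S if all(x in A for x in X if (x, a) in forces))


def interior(A):
    """int A = ext(box A)."""
    return ext(box(A))


def adjunction_holds():
    """(1): ext U is contained in A iff U is contained in box A."""
    return all((ext(U) <= A) == (U <= box(A)) for U in subsets(S) for A in subsets(X))


def is_interior_operator():
    """Conditions i.-iii., checked on every pair of subsets of X."""
    return all(
        interior(A) <= A
        and (not A <= B or interior(A) <= interior(B))
        and interior(A) <= interior(interior(A))
        for A in subsets(X)
        for B in subsets(X)
    )


opens = [A for A in subsets(X) if interior(A) == A]
left, right = frozenset({0, 1}), frozenset({1, 2})
print(adjunction_holds(), is_interior_operator())
print(left in opens, right in opens, (left & right) in opens)
```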

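Quite similarly, the usual definition of the closure $\mathrm{cl}\, A$ of a subset $A$ of $X$ says that $x \in \mathrm{cl}\, A$ if any neighborhood of $x$ intersects $A$. In our notation,

$$\mathrm{cl}\, A = \{x \in X : (\forall a \in \Diamond x)(\exists y \in \mathrm{ext}\, a)(y \in A)\},$$

that is $\mathrm{cl}\, A = \mathrm{rest}\, \Diamond A$. It can be proved that cl is a closure operator, that is

i. $A \subseteq \mathrm{cl}\, A$,

ii. if $A \subseteq B$ then $\mathrm{cl}\, A \subseteq \mathrm{cl}\, B$,

iii. $\mathrm{cl}\, \mathrm{cl}\, A \subseteq \mathrm{cl}\, A$,

for any $A, B \subseteq X$. As above, the proof is based on the adjunction

$$\Diamond A \subseteq U \ \text{ iff } \ A \subseteq \mathrm{rest}\, U, \quad \text{for any } A \subseteq X \text{ and } U \subseteq S \qquad (2)$$

and the fact that $\Diamond$ and rest preserve inclusion.

As we did above with open subsets, we say that $A$ is closed if $A = \mathrm{cl}\, A$, even if cl is not a closure operator in the sense of topology (since the union of two closed subsets is not necessarily closed).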
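
The same kind of finite check works for the closure. The sketch below is again only an illustration on a hypothetical three-point basic pair with names of our choosing: it verifies adjunction (2) and the three closure conditions by enumeration, and shows two closed subsets whose union is not closed.

```python
from itertools import combinations

# Hypothetical finite basic pair; names are illustrative only.
X = frozenset({0, 1, 2})
S = frozenset({"a", "b"})
forces = frozenset({(0, "a"), (1, "a"), (1, "b"), (2, "b")})


def subsets(base):
    """All subsets of a finite set, as frozensets."""
    items = sorted(base)
    return [frozenset(c) for n in range(len(items) + 1) for c in combinations(items, n)]


def diamond(A):
    return frozenset(a for a in S if any((x, a) in forces for x in A))


def rest(U):
    return frozenset(x for x in X if all(a in U for a in S if (x, a) in forces))


def closure(A):
    """cl A = rest(diamond A)."""
    return rest(diamond(A))


def adjunction_holds():
    """(2): diamond A is contained in U iff A is contained in rest U."""
    return all((diamond(A) <= U) == (A <= rest(U)) for A in subsets(X) for U in subsets(S))


def is_closure_operator():
    """Conditions i.-iii. for cl, checked on every pair of subsets of X."""
    return all(
        A <= closure(A)
        and (not A <= B or closure(A) <= closure(B))
        and closure(closure(A)) <= closure(A)
        for A in subsets(X)
        for B in subsets(X)
    )


closed = [A for A in subsets(X) if closure(A) == A]
p, q = frozenset({0}), frozenset({2})
print(adjunction_holds(), is_closure_operator())
print(p in closed, q in closed, (p | q) in closed, sorted(closure(p | q)))
```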

4 Formal Open and Formal Closed Subsets 

Because of the symmetry between the left and the right side of a basic pair $X \xrightarrow{\ \Vdash\ } S$, the above definitions of $\mathrm{int} = \mathrm{ext}\, \Box$ and $\mathrm{cl} = \mathrm{rest}\, \Diamond$ also have symmetric definitions, obtained by replacing each operator with its symmetric: $\mathcal{C} = \Diamond\, \mathrm{rest}$ and $\mathcal{A} = \Box\, \mathrm{ext}$. By symmetry, it is immediate that $\mathcal{C}$ is an interior operator and $\mathcal{A}$ is a closure operator. We now see that actually $\mathcal{A}$ is something already known, while $\mathcal{C}$ is in a sense what we were looking for. In fact, spelling out the definition of $\mathcal{A}$, we see that

$$a \in \mathcal{A}U \equiv (\forall x \in X)(x \Vdash a \to (\exists b \in U)(x \Vdash b))$$
that is, $a \in \mathcal{A}U$ if all concrete points forcing $a$ also force $U$, which is the relation between $a$ and $U$ which was meant to be expressed by the formal cover $a \triangleleft U$. So, as in formal topology, we say that $U$ is formal open if $U = \mathcal{A}U$, even if in the wider generality of basic pairs the closure operator $\mathcal{A}$ does not necessarily satisfy distributivity, since $X$ is not equipped with a topology in the traditional sense. Such generality, however, allows us to see that $\mathcal{A}$ is symmetric to cl, which means that the notion of “$a$ being covered by $U$”, that is $a \in \mathcal{A}U$, is just the symmetric of $x \in \mathrm{cl}\, A$, that is “$x$ is an adherence point of $A$”; that is, the formula defining one notion can be obtained from the other by interchanging points and opens. This simple fact too was, apparently, not noticed before.

More interesting is the second operator $\mathcal{C}$, which is the novelty that emerged, by symmetry with int or equivalently by duality with $\mathcal{A}$, from the general study of basic pairs. Spelling out its definition, we have

$$a \in \mathcal{C}F \equiv (\exists x \in X)(x \Vdash a \ \&\ (\forall b \in S)(x \Vdash b \to b \in F))$$

which we can now recognize as a strengthening of the intuitive pointwise interpretation of the positivity predicate. In fact, since $(\forall b \in S)(x \Vdash b \to b \in F)$ is just $\Diamond x \subseteq F$, we see that $a \in \mathcal{C}F$ means not only that $a$ is inhabited by a concrete point $x$, but also that $\Diamond x \subseteq F$, that is all neighborhoods of such a point $x$ are elements of $F$. As we write $a \triangleleft U$ for $a \in \mathcal{A}U$, we will also write $\mathrm{Pos}(a, F)$ for $a \in \mathcal{C}F$, for any $a \in S$ and $F \subseteq S$, and call Pos a binary positivity predicate. The previous unary positivity predicate is now obtained as a special case, by putting $\mathrm{Pos}(a) \equiv \mathrm{Pos}(a, S)$.

The relevance of binary Pos is that it allows us to define by symmetry the notion of formal closed: we say that a subset $F$ of $S$ is formal closed if $F = \mathcal{C}F$, or equivalently $a \in F$ iff $\mathrm{Pos}(a, F)$.
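
To see the cover and the binary positivity predicate at work, the following sketch (an added illustration; the basic pair with three points and three basic opens and all names are hypothetical) computes the two formal operators as compositions and compares them with the spelled-out quantifier formulas. In this example the element `c` is inhabited, so it is positive in the unary sense, yet it fails the binary predicate with respect to the subset containing `c` alone, because its only point has further neighbourhoods.

```python
from itertools import combinations

# Hypothetical finite basic pair; names are illustrative only.
X = frozenset({0, 1, 2})
S = frozenset({"a", "b", "c"})
forces = frozenset({(0, "a"), (1, "a"), (1, "b"), (2, "b"), (1, "c")})


def subsets(base):
    items = sorted(base)
    return [frozenset(c) for n in range(len(items) + 1) for c in combinations(items, n)]


def diamond(A):
    return frozenset(a for a in S if any((x, a) in forces for x in A))


def box(A):
    return frozenset(a for a in S if all(x in A for x in X if (x, a) in forces))


def ext(U):
    return frozenset(x for x in X if any((x, a) in forces for a in U))


def rest(U):
    return frozenset(x for x in X if all(a in U for a in S if (x, a) in forces))


def formal_closure(U):
    """The operator written box ext in the text."""
    return box(ext(U))


def formal_interior(F):
    """The operator written diamond rest in the text."""
    return diamond(rest(F))


def covered(a, U):
    """Cover spelled out: every point forcing a forces some b in U."""
    return all(any((x, b) in forces for b in U) for x in X if (x, a) in forces)


def pos(a, F):
    """Binary positivity spelled out: some point forcing a has all its neighbourhoods in F."""
    return any(
        (x, a) in forces and all(b in F for b in S if (x, b) in forces) for x in X
    )


agree = all(
    (a in formal_closure(U)) == covered(a, U)
    and (a in formal_interior(U)) == pos(a, U)
    for a in S
    for U in subsets(S)
)
print(agree, sorted(formal_closure({"a"})), pos("c", S), pos("c", {"c"}))
```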

In this way we see that the notions of concrete and formal, open and closed subsets are all defined by means of a pair of relativized quantifiers of the form $\exists\forall$ or $\forall\exists$ (see the picture below). The logical structure is so evident that one could even reverse the perspective and conceive of such topological notions as conceptual tools to treat combinations of quantifiers in an intuitive way.



5 The Isomorphism Theorem 

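By definition of int, any concrete open subset $A$ is of the form $\mathrm{ext}\, U$ for some $U \subseteq S$. Conversely, any subset of $X$ of the form $\mathrm{ext}\, U$, for any $U \subseteq S$, is concrete open, because $\mathrm{ext}\, U = \mathrm{ext}\, \Box\, \mathrm{ext}\, U = \mathrm{int}\, \mathrm{ext}\, U$. Therefore:

$A \subseteq X$ is concrete open iff $A = \mathrm{ext}\, U$ for some $U \subseteq S$.

Quite similarly, one can prove that:

$A \subseteq X$ is concrete closed iff $A = \mathrm{rest}\, U$ for some $U \subseteq S$,
$U \subseteq S$ is formal open iff $U = \Box A$ for some $A \subseteq X$,
$F \subseteq S$ is formal closed iff $F = \Diamond A$ for some $A \subseteq X$.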
It is then easy to see that, when restricted to open subsets (either concrete or formal), the operators ext and $\Box$ are bijective, and one is the inverse of the other. Similarly for closed subsets, with rest and $\Diamond$.

It follows from the fact that int is an interior operator that an arbitrary union of concrete open subsets is concrete open. Symmetrically, an arbitrary union of formal closed subsets is formal closed. Dually, an arbitrary intersection of concrete closed (formal open) subsets is concrete closed (formal open). We can as usual define the meet of an arbitrary family of concrete open subsets $\mathrm{int}\, A_i$ as the interior of the intersection, that is $\bigwedge_{i \in I} \mathrm{int}\, A_i = \mathrm{int}(\bigcap_{i \in I} \mathrm{int}\, A_i)$; dually, the join of an arbitrary family of formal open subsets is defined by $\bigvee_{i \in I} \mathcal{A}U_i = \mathcal{A}(\bigcup_{i \in I} \mathcal{A}U_i)$. So concrete and formal open subsets form two complete lattices. Quite similarly for closed subsets. Then one can prove that:

Theorem. The operator ext is an isomorphism between the lattice of formal 
open and that of concrete open subsets. Dually, the operator rest is an isomor- 
phism between the lattices of formal closed and of concrete closed subsets. 
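
On a finite basic pair the theorem can be observed directly by listing the four families of subsets. The sketch below is an added illustration under assumptions of ours (a hypothetical basic pair and hypothetical names); it checks that ext and box are mutually inverse, inclusion-preserving and inclusion-reflecting between formal open and concrete open subsets, and likewise rest and diamond between formal closed and concrete closed subsets. For complete lattices such an order isomorphism is a lattice isomorphism.

```python
from itertools import combinations

# Hypothetical finite basic pair; names are illustrative only.
X = frozenset({0, 1, 2})
S = frozenset({"a", "b", "c"})
forces = frozenset({(0, "a"), (1, "a"), (1, "b"), (2, "b"), (1, "c")})


def subsets(base):
    items = sorted(base)
    return [frozenset(c) for n in range(len(items) + 1) for c in combinations(items, n)]


def diamond(A):
    return frozenset(a for a in S if any((x, a) in forces for x in A))


def box(A):
    return frozenset(a for a in S if all(x in A for x in X if (x, a) in forces))


def ext(U):
    return frozenset(x for x in X if any((x, a) in forces for a in U))


def rest(U):
    return frozenset(x for x in X if all(a in U for a in S if (x, a) in forces))


concrete_open = [A for A in subsets(X) if ext(box(A)) == A]
formal_open = [U for U in subsets(S) if box(ext(U)) == U]
concrete_closed = [A for A in subsets(X) if rest(diamond(A)) == A]
formal_closed = [F for F in subsets(S) if diamond(rest(F)) == F]


def is_order_isomorphism(f, g, source, target):
    """f maps source onto target with inverse g, preserving and reflecting inclusion."""
    return (
        all(f(U) in target and g(f(U)) == U for U in source)
        and all(g(A) in source and f(g(A)) == A for A in target)
        and all((U <= V) == (f(U) <= f(V)) for U in source for V in source)
    )


print(is_order_isomorphism(ext, box, formal_open, concrete_open))
print(is_order_isomorphism(rest, diamond, formal_closed, concrete_closed))
print(len(formal_open), len(concrete_open), len(formal_closed), len(concrete_closed))
```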

This theorem gives further evidence of the correctness of our definitions. In particular, it shows that a binary positivity predicate $\mathrm{Pos}(a, F)$, or equivalently an interior operator $\mathcal{C}$, is necessary to obtain a predicative notion of formal closed subset which corresponds well, as in the case of open subsets, to that of concrete closed subset.

The following picture summarizes most of the information about open and 
closed subsets: 

concrete closed                                        formal open

      cl       $\forall\exists$      symmetric      $\mathcal{A}$

      int      $\exists\forall$      symmetric      $\mathcal{C}$

concrete open                                          formal closed

Of course, the vertical line at the right refers to the formal side, and at the left 
to the concrete side. Also, the top horizontal line refers to closure operators, and 
the bottom one to interior operators. One diagonal refers to open subsets, the 
other to closed subsets. 

6 Continuity 

What we have seen so far could be summarized by saying that topology begins with basic pairs. They are the simplest extension of the notion of set, that is two sets linked in the weakest possible way, namely by a relation. We are now going to see that continuity begins with the weakest possible way to link two basic pairs, namely a pair of relations giving rise to a commutative square.

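Given two basic pairs $\mathcal{X} = X \xrightarrow{\ \Vdash\ } S$ and $\mathcal{Y} = Y \xrightarrow{\ \Vdash'\ } T$, we say that a pair of relations $r : X \to Y$ and $s : S \to T$ is a morphism, or a relation-pair, from $\mathcal{X}$ to $\mathcal{Y}$ if the diagram

$$\begin{array}{ccc} X & \xrightarrow{\ \Vdash\ } & S \\ {\scriptstyle r}\downarrow & & \downarrow{\scriptstyle s} \\ Y & \xrightarrow{\ \Vdash'\ } & T \end{array}$$

is commutative. Here we assume that composition of relations is defined as usual; then, writing $rx$ for $r\{x\}$, commutativity of the above diagram is expressed by the equation

$$\Diamond\, rx = s\, \Diamond x \quad \text{for any } x \in X. \qquad (3)$$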

Several motivations lead us to consider relations rather than functions, and then to adopt the above definition of morphisms between basic pairs. First of all, relations are more general than functions and they allow us to grasp better the essence of continuity. Secondly, on one hand we obtain the usual definition of continuity for functions as a particular case, but on the other hand we will also be able to give a natural constructive definition of topological Kripke structures. A third good reason for considering relations is that the inherent symmetry of basic pairs is somehow preserved: if $(r, s) : \mathcal{X} \to \mathcal{Y}$ is a morphism, also the inverse $(s^{-}, r^{-})$ is a morphism, from $\mathcal{Y}^{-}$ into $\mathcal{X}^{-}$. This statement would be impossible with functions.

Given a relation $r : X \to Y$, a simple-minded extension of the usual definition of continuity for functions is to require that $r^{-}$ is open. Since any open subset of $Y$ is of the form $\mathrm{ext}\, U$ for some $U \subseteq T$, this amounts to

$$r^{-}(\mathrm{ext}\, U) = \mathrm{int}(r^{-}\, \mathrm{ext}\, U) \quad \text{for any } U \subseteq T.$$

Since ext distributes over arbitrary unions, it is enough to require that

$$r^{-}(\mathrm{ext}\, b) = \mathrm{int}(r^{-}\, \mathrm{ext}\, b) \quad \text{for any } b \in T. \qquad (4)$$

One can see that, putting $a\, s\, b \equiv \mathrm{ext}\, a \subseteq r^{-}(\mathrm{ext}\, b)$, such a requirement is equivalent to (3) above. So, (3) is satisfied when $r^{-}$ is open, for a suitable choice of $s$. On the other hand, it can easily be proved that if $(r, s)$ is a morphism, then $r^{-}$ is open and $s$ is essentially uniquely determined by $r$; in fact, if $(r, s')$ is any other morphism, then $s$ and $s'$ coincide “topologically”, that is $\mathcal{A}(s'^{-} b) = \mathcal{A}(s^{-} b)$ for any $b \in T$. In this sense (3) is equivalent to $r^{-}$ being open; we prefer the former for aesthetic reasons.

An equivalent characterization is reached through a different path. Assume we express the fact that $r^{-}$ is open as:

for any $U \subseteq T$, there is $V \subseteq S$ such that $r^{-}(\mathrm{ext}\, U) = \mathrm{ext}\, V$.

More constructively, this can be expressed by requiring the existence of a family of subsets $V_b \subseteq S$ for $b \in T$ such that

$$r^{-}(\mathrm{ext}\, b) = \mathrm{ext}\, V_b \quad \text{for any } b \in T. \qquad (5)$$

But as we have seen already, the family of subsets $(V_b)_{b \in T}$ is equivalently represented as a relation $a\, s\, b \equiv a \in V_b$, and then (5) becomes

$$r^{-}(\mathrm{ext}\, b) = \mathrm{ext}(s^{-} b) \quad \text{for any } b \in T. \qquad (6)$$

As a matter of fact, (6) is equivalent to (3). Actually, one can prove that also

$$r^{*}(\mathrm{rest}\, F) = \mathrm{rest}(s^{*} F) \quad \text{for any } F \subseteq T,$$

$$\Box(r^{-*} A) = s^{-*}(\Box A) \quad \text{for any } A \subseteq X$$

are equivalent formulations of morphisms.
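
The equivalence between commutativity of the square, equation (3) and equation (6) can be tested by brute force on small examples. The sketch below is an added illustration: the two basic pairs, the relations `r_good` and `r_bad`, and every function name are hypothetical. It enumerates all pairs of relations between the two finite basic pairs, and it also builds the second component from the first by the recipe that relates a to b when the extension of a is contained in the weak anti-image of the extension of b.

```python
from itertools import combinations

# Two hypothetical finite basic pairs; every name below is illustrative.
X, S = {0, 1, 2}, {"a", "b"}
forces_X = {(0, "a"), (1, "a"), (1, "b"), (2, "b")}
Y, T = {"p", "q"}, {"u", "v"}
forces_Y = {("p", "u"), ("q", "v")}


def image(rel, subset):
    """Weak image of subset along rel."""
    return frozenset(b for (a, b) in rel if a in subset)


def inverse(rel):
    return {(b, a) for (a, b) in rel}


def compose(first, second):
    """Relational composition: first, then second."""
    return {(x, z) for (x, y) in first for (y2, z) in second if y == y2}


def square_commutes(r, s):
    return compose(r, forces_Y) == compose(forces_X, s)


def satisfies_3(r, s):
    """(3): diamond(r x) = s(diamond x) for every x in X."""
    return all(
        image(forces_Y, image(r, {x})) == image(s, image(forces_X, {x})) for x in X
    )


def satisfies_6(r, s):
    """(6): the weak anti-image along r of ext b equals ext of the anti-image of b along s."""
    return all(
        image(inverse(r), image(inverse(forces_Y), {b}))
        == image(inverse(forces_X), image(inverse(s), {b}))
        for b in T
    )


def induced_s(r):
    """Relate a to b when ext a is contained in the weak anti-image along r of ext b."""
    return {
        (a, b)
        for a in S
        for b in T
        if image(inverse(forces_X), {a})
        <= image(inverse(r), image(inverse(forces_Y), {b}))
    }


def all_relations(dom, cod):
    pairs = sorted((a, b) for a in dom for b in cod)
    return [set(c) for n in range(len(pairs) + 1) for c in combinations(pairs, n)]


equivalent = all(
    square_commutes(r, s) == satisfies_3(r, s) == satisfies_6(r, s)
    for r in all_relations(X, Y)
    for s in all_relations(S, T)
)
r_good = {(0, "p"), (1, "p"), (1, "q"), (2, "q")}  # its inverse is open
r_bad = {(0, "p")}  # the anti-image of ext u is {0}, which is not open
s_good = induced_s(r_good)
bad_has_partner = any(square_commutes(r_bad, s) for s in all_relations(S, T))
print(equivalent, sorted(s_good), square_commutes(r_good, s_good), bad_has_partner)
```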



7 The Category of Basic Pairs and Related Notions 

Basic pairs and relation-pairs form a category, which we call BP. BP is closely related to the category ReP, that is the category whose objects are arrows in Rel, the category of sets and relations, and morphisms are indeed defined as commutative squares. So, objects of ReP are what we called basic pairs, and morphisms what we called relation-pairs. However, the notion of equality of arrows is not the same, since two arrows in BP are defined to be equal when they behave in the same way “topologically”. So we say that two relation-pairs $(r, s)$ and $(r', s')$ are equal when

$$s\, \Diamond x = s'\, \Diamond x \quad \text{for any } x \in X,$$

$$r^{-}\, \mathrm{ext}\, b = r'^{-}\, \mathrm{ext}\, b \quad \text{for any } b \in T.$$

It can be proved that $(r, s)$ is equal to $(r', s')$ exactly when weak and strong anti-images along $r$ and $r'$ behave equally on open and on closed subsets of $Y$, respectively (that is, $r^{-}\, \mathrm{ext}\, U = r'^{-}\, \mathrm{ext}\, U$ and $r^{*}\, \mathrm{rest}\, U = r'^{*}\, \mathrm{rest}\, U$ for any $U \subseteq T$) and $s$ and $s'$ behave equally on open and on closed subsets of $S$ (that is, $s\, \Diamond A = s'\, \Diamond A$ and $s^{-*}\, \Box A = s'^{-*}\, \Box A$ for any $A \subseteq X$). One can show that such equality is indeed an equivalence relation and that it is respected by composition; in other terms, one can think of BP as a quotient of ReP. So, the usual nice tricks with diagrams are possible in BP as they were in ReP. For instance, the commutative square of the definition of a morphism $(r, s)$ can be read also as a morphism $(\Vdash, \Vdash')$ from the basic pair $X \xrightarrow{\ r\ } Y$ into the basic pair $S \xrightarrow{\ s\ } T$.

A basic pair is technically also the same thing as a boolean Chu space (see [8]); we have chosen to adopt the new name of basic pair to recall that topology is now involved and that the underlying set theory is constructive type theory. The category BP strictly generalizes the category of boolean Chu spaces, because morphisms of Chu spaces are defined as pairs of functions, and in opposite directions. It also gives it a new topological flavour. One can therefore expect from BP an even wider range of applications than those developed and foreseen by V. Pratt for Chu spaces (see his web page [7]).

The notion of continuity for relations (sometimes euphemistically called 
“many-valued functions”) has been considered by various authors, particularly
in the past; a textbook is [1]. Two more recent references are [16], especially sec- 
tion 4.4 where some bibliographic references can also be found, and [17], which 
generalizes^ the notion of continuous relation as introduced in [12]. 

When $X$ and $Y$ are topological spaces, a relation $r : X \to Y$ is said to be lower semi-continuous if $r^{*}$ is closed, that is $r^{*}A$ is closed in $X$ whenever $A$ is closed in $Y$, and upper semi-continuous if $r^{*}$ is open (see [16]). Lower semi-continuity is classically equivalent to $r^{-}$ being open, and hence to our (4), which does not need any free variable on subsets, while the free variable on subsets which is used to express upper semi-continuity is not eliminable. This is why we have adopted the former as our definition (while continuous relations of [1] are required to satisfy both).

Note that our definition is still sufficient to give the usual definition of con- 
tinuity for functions as a special case when the relation r is actually a function. 



8 Topological Kripke Structures 

In textbooks on modal logic, a Kripke structure is usually defined as a set $X$ together with a relation $r : X \to X$. Clearly, it is a special case of basic pair (in which $S = X$). We are more interested however in the fact that basic pairs allow us to introduce a constructive definition of topological Kripke structure in a natural way. In fact, we say that $(X, r)$ is a topological Kripke structure if $X \xrightarrow{\ \Vdash\ } S$ is a basic pair, so that $X$ is topologized by $S$ through $\Vdash$, and $r : X \to X$ is a relation whose inverse $r^{-}$ is open. In other terms, a topological Kripke structure is essentially nothing but a morphism from a basic pair into itself. Then also the notion of p-morphism (called contraction in [12]) can now be generalized, and described simply as a commutative cube, of which one face is $(\mathcal{X}, r)$ and the opposite face is $(\mathcal{Y}, s)$.

9 Extending Formal Topology 

To conclude this preview, we can repeat the process described in the introduction as a motivation for the definition of formal topology, but now starting from a more general situation, given by a basic pair $(X, \Vdash, S)$. The unfolding of the basic picture in the previous pages has shown that, to describe in the best possible way the concrete topological structure of $X$ by means only of the formal side, we have to adopt two primitive relations $\triangleleft$ and Pos, or equivalently two operators $\mathcal{A}$ and $\mathcal{C}$, which will be assumed to be a closure operator and an interior operator, respectively.

^ Note that the generalization of [12] given by M. B. Smyth in [17] is in the opposite direction of that presented here.

Giovanni Sambin and Silvia Gebellato

When $\mathcal{A}$ and $\mathcal{C}$ are defined by means of the relation $\Vdash$ in a basic pair, the link between them is automatically given by the fact that $\mathcal{A} = \Box\, \mathrm{ext}$ and $\mathcal{C} = \Diamond\, \mathrm{rest}$ with respect to the same forcing relation. We now have to add a condition expressing this with no mention of $X$, and hence of $\Vdash$. We thus arrive at



$$\text{compatibility} \qquad \frac{a \in \mathcal{A}U \qquad a \in \mathcal{C}V}{(\exists b \in U)(b \in \mathcal{C}V)}$$

which is easily seen to hold in any basic pair. We thus reach the following definition, which we express in the perfectly equivalent notation with $\triangleleft$ and Pos to underline the relation with the previous definition of formal topology:



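Definition of basic (formal) topology. A triple $\mathcal{S} = (S, \triangleleft, \mathrm{Pos})$ is called a basic formal topology if $S$ is a set and $\triangleleft$ and Pos are infinitary relations satisfying:

$$\text{reflexivity} \quad \frac{a \in U}{a \triangleleft U} \qquad\qquad \text{transitivity} \quad \frac{a \triangleleft U \qquad (\forall b \in U)(b \triangleleft V)}{a \triangleleft V}$$

$$\text{antirefl.} \quad \frac{\mathrm{Pos}(a, F)}{a \in F} \qquad\qquad \text{trans.} \quad \frac{\mathrm{Pos}(a, F) \qquad (\forall b \in S)(\mathrm{Pos}(b, F) \to b \in G)}{\mathrm{Pos}(a, G)}$$

$$\text{compatibility} \quad \frac{a \triangleleft U \qquad \mathrm{Pos}(a, F)}{(\exists b \in U)(\mathrm{Pos}(b, F))}$$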
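
That the cover and positivity relations induced by a basic pair satisfy these five rules can be confirmed by enumeration in the finite case. The sketch below is an added illustration with a hypothetical basic pair and hypothetical names; each rule is read as an implication from its premises to its conclusion and checked for all elements and all subsets of the formal side.

```python
from itertools import combinations

# Hypothetical finite basic pair; names are illustrative only.
X = frozenset({0, 1, 2})
S = frozenset({"a", "b", "c"})
forces = frozenset({(0, "a"), (1, "a"), (1, "b"), (2, "b"), (1, "c")})


def subsets(base):
    items = sorted(base)
    return [frozenset(c) for n in range(len(items) + 1) for c in combinations(items, n)]


def diamond(A):
    return frozenset(a for a in S if any((x, a) in forces for x in A))


def box(A):
    return frozenset(a for a in S if all(x in A for x in X if (x, a) in forces))


def ext(U):
    return frozenset(x for x in X if any((x, a) in forces for a in U))


def rest(U):
    return frozenset(x for x in X if all(a in U for a in S if (x, a) in forces))


def cover(a, U):
    """a is covered by U: a belongs to box(ext U)."""
    return a in box(ext(U))


def pos(a, F):
    """Pos(a, F): a belongs to diamond(rest F)."""
    return a in diamond(rest(F))


def implies(p, q):
    return (not p) or q


SUBS = subsets(S)
reflexivity = all(implies(a in U, cover(a, U)) for a in S for U in SUBS)
transitivity = all(
    implies(cover(a, U) and all(cover(b, V) for b in U), cover(a, V))
    for a in S for U in SUBS for V in SUBS
)
antireflexivity = all(implies(pos(a, F), a in F) for a in S for F in SUBS)
pos_transitivity = all(
    implies(pos(a, F) and all(implies(pos(b, F), b in G) for b in S), pos(a, G))
    for a in S for F in SUBS for G in SUBS
)
compatibility = all(
    implies(cover(a, U) and pos(a, F), any(pos(b, F) for b in U))
    for a in S for U in SUBS for F in SUBS
)
print(reflexivity, transitivity, antireflexivity, pos_transitivity, compatibility)
```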



Due to the complete symmetry of a basic pair, if we transfer the structure of a basic pair onto its left, concrete side, rather than conversely as we did above, we reach a definition which differs from the above only in notation and terminology. That is, we say that $(X, \mathrm{cl}, \mathrm{int})$ is a basic concrete topology if $X$ is a set, cl is a closure operator and int an interior operator, linked by the condition

$$\frac{x \in \mathrm{cl}\, A \qquad x \in \mathrm{int}\, B}{(\exists y \in A)(y \in \mathrm{int}\, B)}$$

which now has an immediate intuitive content since it characterizes the closure of a subset when opens are given by an interior operator. This is a quite simple but rich structure, which never came to life before because it was hidden under the equalities of classical logic.

The notion of basic formal topology strictly generalizes the previous definition of formal topology in two ways. Since no condition expressing that $(\mathrm{ext}\, a)_{a \in S}$ is a base is present in a basic pair, basic topologies have no condition guaranteeing that formal opens form a frame. Since they do form a complete lattice in any case, the difference is distributivity. This was previously expressed by the requirement $\mathcal{A}(U \bullet V) = \mathcal{A}U \cap \mathcal{A}V$, or equivalently $\bullet$-left and $\bullet$-right. Taking up an idea in [15], distributivity can be expressed, even in absence of the primitive operation $\bullet$ of formal intersection, by adding the requirement that $\mathcal{A}(U \downarrow V) = \mathcal{A}U \cap \mathcal{A}V$, where $U \downarrow V = \{a \in S : (\exists b \in U)(a \triangleleft \{b\}) \ \&\ (\exists c \in V)(a \triangleleft \{c\})\}$. In the equivalent notation with $\triangleleft$, $\mathcal{A}(U \downarrow V) = \mathcal{A}U \cap \mathcal{A}V$ is expressed by

$$\downarrow\text{-right} \qquad \frac{a \triangleleft U \qquad a \triangleleft V}{a \triangleleft U \downarrow V}$$
which is sufficient to prove that formal opens form a frame. This offers a simpler formulation of formal topologies, obtained from the definition given in the introduction by suppressing $\bullet$ and $1$ and by replacing $\bullet$-left and $\bullet$-right with $\downarrow$-right. A considerable advantage of this new formulation is that it includes in an easier way those examples where a preorder rather than a binary operation is immediately available (two examples: the power of a set and trees).

However, we are more interested at the moment in the presence of a binary, rather than unary, positivity predicate Pos, and this is the second generalization. In fact, it can be proved that a unary Pos is essentially (apart from the condition of positivity, which can be added at will) the same thing as a trivial binary Pos; Pos is said to be trivial if $\mathrm{Pos}(a, F) \to (\forall b)(\mathrm{Pos}(b, S) \to b \in F)$ holds, which is a constructive way to express that $\emptyset$ and $\mathcal{C}S$ are the only formal closed subsets. Thus the new version of formal topology, with a binary Pos, includes the previous one as a special case. It also includes the theory of locales, simply as the special case with an improper Pos, that is one for which $\mathrm{Pos}(a, F)$ is always false.

In our opinion, what we have presented here is sufficient to conclude that the basic picture is indeed the basic perspective for a very general approach to constructive topology. The control of distributivity (that is, the fact that it can be added on top, in the form of $\downarrow$-right) opens the way to the development of nondistributive topology, in which the formal and the concrete approach seem to be mathematically equivalent. The presence of binary Pos permits a predicative treatment of formal closed subsets, which now have a primitive definition parallel to that of formal open subsets, just as the combination of quantifiers $\exists\forall$ is parallel to $\forall\exists$. Of course, much work is still to be done to reach a solid development (we have so far extended to the general case, that is nondistributive and with binary Pos, a portion of previous formal topology; as an example, arrows between basic topologies can be obtained by taking as defining conditions exactly the properties of the second component in a relation-pair). Given the novelty of the underlying ideas, however, we would not be very surprised if it led to some unexpected new applications.